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Abstract 

This research is concerned with designing representations for analytical reasoning 
problems (of the sort found on the GRE and LSAT). These problems are intended 
to test the ability to draw logical conclusions from information presented and to 
synthesize that information in order to deduce interrelationships. 

A computer program was developed that takes as input a straightforward predicate 
calculus translation of a problem, requests additional information if necessary, decides 
what to represent and how, designs representations that capture the constraints of 
the problem, and finally, creates and executes a LISP program that uses those repre- 
sentations to produce a solution. Even though these problems are typically difficult 
for theorem provers to solve, the LISP program that uses the designed representations 
is very efficient. 

The representations designed by the system are powerful because they capture the 
constraints of a problem in two ways, (i) The structure of the representation resembles 
the structure of the thing represented. For example, consider representing married 
couples as sets of size two. The structure of this representation resembles the structure 
of married couples because both have exactly two individuals in them, (ii) The 
structure enables efficient behaviors that enforce a problem's constraints by keeping 
them invariant in the structure. For example, as a set representing a married couple 
is manipulated, a behavior associated with it maintains its fixed size. This behavior 
efficiently enforces the size constraint on married couples. 

The system designs a representation that captures as many of the constraints of a 
problem as possible. When a representation captures more constraints fewer sets of 
facts can be expressed in it. For example, in a representation that does not capture 
the symmetry of the married relation, one can state "A is married to B" and "B is 
not married to A." However, it is not possible to express these two statements in a 



representation that captures symmetry. Allowing fewer sets of facts to be stated in 
a representation reduces the space that a problem solver must consider; this in turn 
results in more efficient problem solving behavior. 

Representation design consists of three processes: classification, concept introduc- 
tion, and operationalization. Classification uses a library of structures each capturing 
different kinds of constraints. It finds the most specialized library structure for rep- 
resenting each concept in a problem. Concept introduction is a way of enhancing 
classification. When classification fails to capture all the constraints on a concept, 
introduction tries different ways to represent the concept. For example, sometimes 
when classification fails to capture all the constraints on a relation like "married," 
it introduces a concept like "couple," a function mapping an individual to the set of 
individuals he/she is married to. Classification of "couple" results in a representation 
that captures more constraints than did the representation of •'married." 

Classification and concept introduction run as coroutines, trying to capture all of the 
constraints of a problem. As they do this, the statements of captured constraints get 
removed from the problem. However, in many problems these processes fail to capture 
all of the constraints of a problem, leaving statements of the uncaptured constraints. 
Operationalization then tries to capture the constraints of the remaining statements 
by writing procedures and using these to further specialize the representations created 
by the previous processes. 

The demonstration system has designed representations for twelve analytical rea- 
soning problems. In each case, designing a representation and using it to solve the 
problem has proven to be far more efficient than using a binary resolution theorem 
prover to search for a solution in the initial representation. 
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Chapter 1 



Introduction 



It has long been acknowledged that good representations are key in effective prob- 
lem solving. But what is a "good" representation? Most answers fall back on a 
collection of somewhat vague phrases, such as "make the important things explicit; 
expose natural constraints; be complete, concise, transparent; facilitate computa- 
tion" [Winston84]. These are of some assistance, but leave unresolved at least two 
important issues. First, saying that a "good" representation makes the "important" 
things explicit really only relabels the phenomenon — How are we to know what is 
important? Second, while phrases like these can conceivably serve as recognizers of 
good representations, little progress has been made on understanding how to design 
a good representation prospectively. 

I have developed a new approach to designing good representations. The approach 
has the following key features: 

• It begins with an initial problem statement. It determines what to represent and 
assists in identifying missing information that is required to solve the problem. 
Thus it assists in determining what is "important" in a problem statement. 

• It offers a technical explanation of what makes for a good representation, claim- 
- ing that it is one that captures constraints of a problem directly in its structure 

and behavior rather than leaving the constraints to be enforced by the problem 
solver using that representation. 

• It shows how to design a good representation, and then how to solve the problem 
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using that representation. 

I have implemented a demonstration of this approach, which I call the representation 
design system, and have tested it on a small number of verbal reasoning problems 
of the sort found on graduate level standardized admissions tests. One of these 
problems, shown in Figure 1.1, is used in this thesis for illustration. This problem 
will be referred to as the FAMILIES problem. 

The system takes as input a straightforward predicate calculus translation of a prob- 
lem, requests additional information if necessary, decides what to represent and how, 
designs representations tailored to the problem, and finally creates and executes a 
LISP program that uses those representations to produce a solution. 

Given: M. N, O. P, Q, R, and S are all members of the same family. N is 
married to P. S is the grandchild of Q. is the niece of M. The mother of 
S is the only sister of M. R is Q's only child. M has no brothers. N is the 
grandfather of 0. 
Query: Who are the siblings of S. 

Figure 1.1: The FAMILIES Analytical Reasoning Problem 

1.1 Motivation 

My approach is motivated in large part by my own observations of the problem solving 
behavior people exhibit when solving problems of the sort shown in Figure 1.1, and 
inspired by the striking difference between that behavior and what we might call 
the "classroom theorem-proving approach." This approach begins by translating the 
problem into predicate calculus (Figure 1.2). Then a general purpose theorem prover 
(e.g., using binary resolution) is used to search for a solution using the initial problem 
statements. 

One problem with this approach is that the initial problem specification is incomplete: 
nothing in Figure 1.2, for instance, indicates that the married relation is symmetric 
or that grandfather is the father of the father, etc. Once identified, that information 
is easily encoded as additional axioms. The harder part is knowing what is missing: 
on this task the classroom theorem-proving approach offers us little or no guidance. 
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Given: 

M, N, 0, P, Q, R, and S are all 

members of the same family. 

N is married to P. 

S is the grandchild of Q. 

is the niece of M. 

The mother of S is the only 

sister of M. 

R is Q's only child. 

M has no brothers. 

N is the grandfather of 0. 

Query: 

Who are the siblings of S. 



M : family -member, 

. . . , S : family -member 

married{N,P) 

grandchild(S, Q) 

niece(0,M) 

mother(S,x) ■& sister(M,x) 

[sister(M, x) A sister(M, y)] 

=> x = y 
child(Q,x) <& x = R 
~ i brother(M,x) 
grandfather(0, N) 

find-all x \ sibling(S,x) 



Figure 1.2: Line-by-line translation of FAMILIES problem to sorted first order logic. 
(Upper case symbols are zeroary constants. Lower case symbols in variable positions are universally 
quantified. Also note that we include a separate language (to be described later) for expressing 
a problem's queries. The above statement involving find-all is an example of a statement in this 
language. ) 



More important from the perspective taken in this research is that a human problem 
solver would not use an unstructured collection of axioms, but would instead design 
and use representations that capture the constraints of the problem to produce solu- 
tions far more effectively. An example of this type of representation is illustrated in 
Figure 1.3, which shows two statements of the FAMILIES problem in a representation 
people commonly use. 

Such representations are powerful because they capture the constraints of a problem, 
in two ways: (i) the structure of the representation resembles the structure of the 
thing represented (i.e, they are "direct" jSlomanTlj), and (ii) this structure enables 
efficient behaviors that enforce a problem's constraints by keeping them invariant in 
the structure. These are both illustrated here with the example "children-of" link. 

The "children-of" link syntactically captures the 1-1 function between a couple and 
their set of children. There are also specialized behaviors associated with this link 
because it is a 1-1 function. One behavior uses the fact that "children-of" is a function 
to combine children sets when they are the "children-of" the same couple (i.e., if two 
couples are equal, the objects their "children-of" links point to are equal). Because 
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"R is the only child of Q" « S is the grandchi i d of Q" 

Figure 1.3: Two statements in a representation commonly used by people. 
A divided rectangle represents a couple; a circle represents a set of children of the same couple: solid 
circles are closed sets, dashed circles are sets all of whose members may not be known; the directed 
arc represents the "'children-of function between a couple and their set of children. 

it is a 1-1 function, another behavior does the same to couples that are parents of 
the same children sets. 

Using the fact that couples are disjoint, if we have what appears to be two distinct 
couples (the top boxes in Figure 1.3) and also know that they share an individual (Q), 
then they are in fact the same and hence can be combined. Using a behavior that 
embodies this fact and the behaviors associated with "children-of," the two structures 
in Figure 1.3 can be combined to yield Figure 1.4. 

As will be detailed later, this combination process is of fundamental importance be- 
cause it computes all the relevant 1 ground consequences of the conjunction of state- 
ments as they are combined. This process is tightly constrained by the syntax of the 
representations. For instance, Figure 1.4 represents all the relevant consequences of 
the conjunction of the structures in Figure 1.3 (e.g., "S is the child of R"). The conse- 
quences are computed by the behaviors described above performing local operations 
on the two structures. 

The representation design system accepts a problem stated in predicate calculus 
as input. For example, the system is given the predicate calculus version of the 
FAMILIES problem (i.e., those statements shown on the right in Figure 1.2). The task 
of the system is to design representations like Figure 1.4, by picking out the important 
concepts in the problem (such as "couples" or "the siblings of an individual"), and 

Relevance will later be defined in terms of the problem query. 
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Figure 1.4: Composition of the structures in Figure 1.3. 

finding ways to operate on them using special purpose manipulations of the sort 
illustrated by the transformation from Figure 1.3 to Figure 1.4. The system chooses 
what to represent and how, then solves this problem using those representations. 

The demonstration system has been tested on the FAMILIES problem, on two other 
problems shown in Figure 1.5 and Figure 1.6, and on three variations of each. The 
system designs representations for all twelve problems. In each case, designing a 
representation and using it to solve the problem has proven to be far more efficient 
than using a resolution theorem prover to search for a solution in the initial represen- 
tation. For example, representation design for the FAMILIES problem takes twenty 
minutes. 2 A LISP program using the representation is generated in five minutes; it 
solves the problem in less than three seconds. By contrast, missing information was 
added to the problem statement and then a connection graph resolution theorem 
prover was used to solve it. This took 988,442 resolutions — three hours and five 
minutes — to find a solution. 

The representations produced by the system are expressed in terms of abstract data 
types: the data structure of the data type is used to implement the structure part of 
a representation; the procedures associated with the data type are used to implement 
the behaviors. For example, the system designs an ADT for couples in the FAMILIES 
problem. Instances of this type serve roughly the same function as the rectangles in 
Figure 1.3 and Figure 1.4: they represent specific couples like the one containing Q. 
The data structure part of a couple contains two slots for holding the two individuals 
in a couple. There are several procedures associated with couple. One procedure 
combines distinct couples if they have a common member. 



All times given are for programs running on a Symbolics 3600. 
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Eight law professors are housed in a single wing of a building. The wing contains ten 

offices, numbered 1 to 10, in that order; each professor is assigned to a different office, 

and two offices are left empty for use as meeting rooms. The professors are named 

Boswell, Dyer, Garrett, Harrelson, Kranepool, Ryan, Taylor, and Weis. 

Dyer is four offices away from Kranepool. 

There is one empty office and one occupied office between Taylor and Harrelson. 

Ryan is in an office next to Boswell. 

Dyer is in an office next to Garrett. 

Kranepool is between an occupied office and an empty office. 

Weis is in office 2. 

Garrett is in office 7. 

Who is in office 4? 

Which offices are unoccupied? 

Is Ryan, Dyer, Garrett, Taylor a possible sequence of offices? 

Figure 1.5: The PROFESSORS problem 

Note that while my system designs representations in terms of abstract data types, the 
language used to express the design is not so much the issue: much the same effect can 
no doubt be accomplished by a skilled logic programmer carefully selecting axioms, 
lemmas, and special purpose inference rules. Whatever the language, the important 
point is the careful selection of representations that capture the constraints of the 
problem. 



1.2 Overview of Representation Design 

This section provides an overview of the process of representation design. Chapter 2 
discusses the descriptions that the system uses. The four subsequent chapters describe 
in detail the different elements of the representation design process. 

Representation design begins with a problem statement, a collection of statements 
in a sorted first order logic, along with one or more queries written in a separate 
query language described later. The constant symbols in the statements refer to 
what we will call concepts: individuals, relations, and functions. For example, in the 
FAMILIES problem the statement, 

"ix\/y[married{x,y) => married(y,x)}, 
the symbol married refers to the concept "married." 
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A restaurant employs eight waiters, D, E, F, G, H, I, J, and K, each of whom works 

four days a week. The restaurant is open every day except Monday. On Friday and 

Saturday, a staff of six waiters is needed. On all other days when the restaurant is 

open, a staff of five is needed. 

D cannot work on Tuesday or Thursday. 

E cannot work on Wednesday. 

G cannot work on Thursday or Saturday. 

H cannot work on Friday. 

J cannot work on Tuesday or Sunday. 

K cannot work on Wednesday or Friday. 

Is D,E,F,I,K a possible staff of waiters for a Tuesday? 

Is Tuesday, Wednesday, Thursday, Sunday a possible work week for G? 

Figure 1.6: The WAITERS problem 

A familiar notion in logic is that statements constrain the possible models of a prob- 
lem. For example, the above statement constrains all models of the FAMILIES prob- 
lem to be models in which "married" is symmetric. An important part of my method 
is to design representations that are constrained in the same way as the models of a 
problem. 

1.2.1 Representation 

A representation is a mapping between concepts and syntactic structures in a repre- 
sentation language. For instance, the representation of a problem expressed in first 
order logic is the mapping between the concepts mentioned in the problem statement 
and syntactic structures of first order logic. In this case, there are three syntactic 
structures: constants (i.e., symbols that appear as terms), relations (symbols fol- 
lowed by lists of arguments that appear as atomic formulas), and functions (symbols 
followed by lists of arguments that appear as terms). The syntactic category of a 
symbol tells us how the symbol is represented. For example, in the FAMILIES prob- 
lem statement "married" is represented as a relation because the symbol married 
appears in atomic formulas of the form married(termi, tervn-i). 

We say that a representation captures a constraint when, expressed in that represen- 
tation, the constraint follows from the meaning postulates of the representation. For 
example, a function-symbol in first order logic captures the notion of a "function" 
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by capturing the "single valued" constraint (among other things). More precisely, 
the single valuedness of a concept is captured when it is represented as a function 
because single valuedness then follows from the meaning postulate: 3 

VxVyVF[x = y => F(x) = F(y)}. 

Let us consider the sense in which a syntactic structure can be said to capture a con- 
straint. A function symbol captures the single valued constraint with a combination 
of structure and behavior. The structure is function application (i.e., F(x x , . . .,£„)) 
restricted to appear in the place of terms in the logic. The behavior is that of unifi- 
cation allowing F(x) to unify with F(y) only if x unifies with y. 

The methodology described here takes explicit account of the fact that a combination 
of structure and behavior is required to capture constraints. The structures appearing 
in the range of the representations that my system designs have behavior associated 
with them. The structures capture constraints by having features that correspond to 
those constraints. For example, in the FAMILIES representation, a married couple 
is represented as a structure with two slots. The feature, two slots, corresponds to 
the property "married couples have exactly two individuals in them." 

Structures also have procedures associated with them that provide them with be- 
havior enforcing their constraints. 4 For example, the procedures associated with the 
structure for married couples interpret each slot in the structure as containing exactly 
one individual. It is through the combination of structure (two slots) and behavior 
(each slot is interpreted as containing one individual) that structures capture con- 
straints. 

The procedures of a structure enforce its constraints as instances of the structure 
are created and modified. For example, there is a procedure that produces a single 
couple from the two separate couples < A,B > and < A,C >. This procedure 
enforces the fact that couples are disjoint by combining two couples when they share 
an individual. When the new structure is created, another procedure associated with 
couples makes B = C to ensure that the new couple has exactly two members. 



3 This is a standard way to axiomatize first order logic with equality and function symbols. See, 
for example, [Bell fc Machover 77]p.l08. 

4 0r, put slightly differently, the -procedures preserve the semantics of the representations. 
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Notice that if B ^ C in the example above, stating that < A,B > and < A, C > are 
couples is contradictory. In this case, the structure used to represent couples must 
signal a contradiction to enforce its constraints. In general, when a representation 
captures a constraint, it signals a contradiction if a collection of facts is stated that 
violate the constraint. 

One representation of a problem is better than another when it captures more of 
the constraints on the problem's concepts. For example, consider two different rep- 
resentations of a problem in first order logic: In the first, the concept "mother" is 
represented as the binary relation mother, while in the second it is represented as the 
function mother-of. A problem statement using the relation representation might 
contain the statements: 

mother(A, B) 

\/x\/y\/z[mother(x,y) A mother(x, z) => y — z\. 

In a representation in which "mother" is represented as the function mother-of, the 
same information expressed in the two statements above is expressed in the single 
statement mother-of(A) = B because the single valued property of "mother" is 
captured by mother-of, i.e., the general statement above reformulated in terms of 
mother-of follows from the properties of functions in first order logic. 

Because better representations capture more constraints, they aid in problem solving 
by reducing the search space that a problem solver must consider. To illustrate, 
let us consider solving a very simple problem in two different representations, one 
capturing more constraints of a problem than the other. Suppose we are using a 
resolution theorem prover as our problem solver. In the first representation, the 
problem of interest is expressed in terms of, among other things, a relation R that is 
single valued. That is, the problem statement contains: 

\JxVyVz[R{x,y)AR(x,z)^y = z} (1) 

Suppose further that in order to solve the problem the theorem prover must find a 

contradiction in the two statements: 

VxVz3y[R{x,y),\S(y,z)} ' (2) 

Vx\/z3y[R(x,y)A-S(y,z)} (3) 

Finding the contradiction requires using the fact that R is single valued — that for 

every x there can be at most one y such that R(x,y) -, — to determine that the negative 
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and positive occurrences of S resolve. The solution path for a simple resolution 
theorem prover will have length greater than or equal to four to reach this conclusion 
using (1) above. For the interested reader, Figure 1.7 gives the clausal form of this 
problem and one possible shortest solution. 

The clausal form of the problem: 

(l'hR(x 3 ,y 3 ) V -.fl(x 3 , z 3 ) V j/3 = z 3 

(2')R(x 1 ,a(x l )) 

(2")5( ( t(x 1 ),z 1 ) 

(S')R(x 2 ,t(x 2 )) 

(rhS(r(x 2 ),z 2 ) 

Note that a and r are skolem functions. 

Here is one possible shortest solution path: 



Conclusion 



(4) -iR(x 1 ,z 3 )Va(z 1 ) = z 3 

(5) <t{x 2 ) = t(x 2 ) 
(6)S(t( Xi ), Zi ) 

(?)□ 



Justification 



(2') and (1') 

(4) and (3') 

(5) and (2") 

(6) and (3") 



Each of the steps, except (6), is an application of the binary resolution rule. Step (6) 
requires a special inference rule for equality such as paramodulation. 

Figure 1.7: A proof with the "single valued" property stated as an axiom. 

In the second representation, the concept "R," which was represented as a relation, 
is represented as the function F. In this representation: 

• the constraint of (1) is enforced by the theorem prover 's unifier without the 
need for any explicit axiom 

• the problem formulation becomes simply: 

S(F( Xl ), Zl ) 
-S(F(x 2 ),z 2 ) 

• and the required inference is reduced to one resolution step. 



1.2.2 A Model of Representation Design 

In this thesis, representation design is viewed as a process of capturing constraints. 
The input to the process is a problem statement, the representation mapping induced 
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by that statement (i.e., the structure representing each concept), and the meaning 
postulates for a set of available structures. The system tries to select new structures 
to represent the problem concepts so that the constraints on those concepts are 
captured. As a simple example, suppose that a problem statement contains 

mother(A, B) 

\/xVy\/z[mother(x,y) A mother(x,z) =s> y = z\ 

and that the available structures are those of first order logic with equality and func- 
tion symbols. Then the system will change the representation of "mother" to a func- 
tion by uniformly rewriting the atomic formula mother(x,y) as y = mother-of(x). 
The new statement of the problem is 

B = mother(A) 

VxVy\/z[y = mother-of(x) A: = mother- of(x) => y = z] 

and the general statement follows from a suitably instantiated version of the meaning 

postulate for functions, i.e., 

Vx\/y[x = y => mother-of(x) — mother -of(y)]. 

Therefore, the problem's general statement can be removed, leaving the problem rep- 
resentation as simply B — mother(A). For larger problems, the process considers 
each concept in turn, attempting to capture its constraints by redefining its repre- 
sentation. 

The earlier discussion about the properties of good representation distinguished be- 
tween capturing constraints in structure and behavior. The process described above 
is not concerned with whether constraints are being captured in structure or behav- 
ior, only with whether they are being captured. The representation design system's 
ability to design good representations is, therefore, dependent on the quality of the 
structures available to the constraint capturing process: they must be efficient com- 
binations of structure and behavior. This is one reason that the research is called 
representation design. Like other designers, this process can assemble primitives 
(structures) to meet design specifications. However, a precondition for the process 
generating good designs is that the primitives be good. 

Other desiderata for design primitives also apply to the available structures for rep- 
resentation design. For example, one desirable property of a collection of primitives 
is orthogonality. In thecase of representation design, orthogonality amounts to the 
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collection of structures being able to capture a wide variety of constraints. This is 
not the case, for instance, in the example above where the only available structures 
are relation and function. As we will see, the representation design system described 
in this thesis relies on a larger, more varied collection of structures. 

1.2.3 Representations and Problem Classes 

One question that arises about the process described in the last section is: Should 
we really try to design a representation that captures every constraint of a problem, 
no matter how serendipitous? The advantage of doing so is that a representation 
capturing all of a problem's constraints can take advantage of every peculiarity of a 
problem to gain efficiency. However, when a representation captures every constraint 
of a problem it can not be used to solve any other problems. 

Given a problem, my system attempts to design a representation that captures all 
of the constraints of that problem 's class. Intuitively, two problems are in the same 
class when the same general constraints are relevant to solving them. They can differ 
in the individuals they mention, in the particular relationships between those indi- 
viduals, or in which individuals are constrained in a particular way. For example, a 
problem in the same class as the FAMILIES problem would refer to the same col- 
lection of concepts (i.e., married, child, grandchild, niece, mother, sister, brother, 
grandfather, sibling ). However, it could mention a different collection of individuals 
and could have a different number of individuals being married. 

Note that since problem statements may have function symbols in them, some re- 
lationships between individuals may be specified as equalities between terms, e.g., 
mother(A) = B. Since we want to allow different problems in the same class to have 
different relationships between individuals, we must allow problems in the same class 
to have statements that equate different terms. 

In defining problem class more formally, problem statements are divided into three 
types: those that we term specific because they mention only individuals (these in- 
clude existentially quantified variables that are not in the scope of any universally 
quantified variables); those that we term general because they do not mention in- 
dividuals (i.e., have only universally quantified variables or existentially quantified 
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variables in the scope of universals); and those that we term mixed because they 
contain both individuals and universally quantified variables. 

To capture the intuition that specific individuals and relationships do not affect a 
problem's class, we abstract a problem to a class description as follows: 

1. Generalize the specific and mixed statements by replacing named individuals 
with existentially quantified variables. For example, 

\/x^br other (M, x) 

to 

3pix-^br other (p, x). 

2. Break up conjunctive specific and mixed statements. For example, break up 
the statement 

3x3y3z'br other (x,y) A married(y, z)\ 
as 

3x3y[br other {x , y)] 

3x\ 3yi [married(xi, y% )} . 

3. Remove specific statements that are equalities. 

The resulting set of statements is a class description in the sense that we could use it 
to generate problem instances by choosing different individuals for the existentially 
quantified variables. More precisely, we can create an instance as follows. For each 
existentially quantified statement, create one or more statements by substituting in- 
dividual names for the existentially quantified variables. Then add equalities between 
terms in the new problem statement. 

There are good reasons for designing to a problem class instead of attempting to 
design to a particular problem. The efficiency of the resultant representation for 
problem solving is very close to that of a representation that captures all the con- 
straints of the specific problem. This representation is also reusable. 

Although the definition of problem class does not explicitly refer to a problem's 
query, problem class is defined in terms of what is relevant to solving a problem. 
This depends on what is being asked. The system attempts to restrict its description 
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of a problem class so that it contains only statements that are relevant to solving 
a problem by eliminating statements that are irrelevant before it derives the class 
description. A precise definition of irrelevance and the system's technique for elimi- 
nating it are discussed in Chapter 3. 

After eliminating irrelevance from a problem statement, the system creates two sets 
of statements. One set contains the specific and mixed statements of the problem. 
The other set contains a description of the problem class, i.e., the specific statements 
are generalized to replace named individuals by existentially quantified variables. 
The system then attempts to design a collection of structures that capture all of the 
constraints of the problem class. 

Representation design continues until all of the constraints in a class description are 
captured or until the system can capture none of the remaining constraints. 

To solve the problem, the specific and mixed statements are expressed in the designed 
representation creating instances of the structures designed from the class description. 
Let us call this set of structures, a problem situation. Because the constraints of the 
class are captured by the representation, those constraints are true of every problem 
situation that can be built with that representation. Often a constraint is captured 
by responding to the addition of new facts to a situation by adding still more facts. 
For example, the representation designed for the FAMILIES problem captures the 
symmetry of "married" and, therefore, when a specific statement like married(N, P) 
is expressed in this representation, the symmetry of "married" is enforced. As a 
result, the situation containing married(N,P) will also contain married(P,N). 

1.2 A Implementation of Representations 

As noted earlier, a representation is a mapping between concepts and structures 
with behavior. The representation design system implements structures as abstract 
data types (ADTs) or objects in the object programming sense. The data structure 
part of an ADT is used for implementing structure and the procedures are used for 
implementing behaviors. Access to the data structure "inside" an ADT is controlled 
by its procedures, preventing arbitrary manipulations of the data structure. The 
procedures of the ADTs that my system designs enforce constraints by maintaining 



1.2, OVERVIEW OF REPRESENTATION DESIGN 23 

invariants in their data structure as information is added to problem situations. 

For example, when "married" is represented as a relation, the system designs an ADT 
in terms of sym-rel. An instance of this ADT , denoted married, maintains a list of 
the pairs of individuals that are known to be married in a problem situation. Married 
captures the symmetry of "married" because is has a procedure that adds < x,y > to 
the list in married whenever < y, x > is added. The sym-rel ADT controls access 
to its internal list so that it is impossible for < x,y > to get on the list without 
< y, x > also appearing there. Thus, using this ADT, no problem situation can be 
constructed which violates the symmetry constraint. 

In general, there are three different kinds of procedures associated with ADTs. The 
first kind add information to problem situations by adding to the ADT instances or 
creating new instances. For example, one procedure adds new pairs of individuals to 
married. The second kind of procedure enforces constraints (maintains invariants). 
The procedure associated with married that enforces symmetry is an example of 
this. The third kind answers queries by inspecting the structures built by ADTs. For 
example, another procedure associated married checks to see whether two individuals 
are married in a problem situation by searching the list for a particular pair. 

1.2.5 Knowledge about Representations 

The system synthesizes representations from a library of prototypical structures, or 
classes in the object oriented programming sense. For example, there is a library 
prototype called relation. Representations of particular relations are created by 
instantiating the prototype relation. For example, married is an instance defined 
in terms of relation. Here is its definition: 

married: relation(family-member, family-member), 

where family-member is the representation of the sort family-member. 

The relations, functions, and individuals found in a problem are represented ini- 
tially by instantiating the library prototypes relation, function, and individual 
respectively. 

Let us now be specific about the typographic conventions that have been used. They 
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are summarized in Figure 1.8. 
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Figure 1.8: Typographic conventions 



1.2.6 The Goal of Representation Design 



The next four sections outline the four processes of representation design. The pro- 
cess as a whole tries to design representations that are complete and maximally con- 
strained. Each of the four processes contributes to this goal in different ways. 

A representation is complete with respect to a problem class just in case every problem 
situation in the class is representable. We assume that first order logic is complete, 
thus initial problem representations are always complete. The system must be sure to 
preserve this property as constraints are captured in a representation being designed. 

A representation is maximally constrained with respect to a class if it captures all of 
the constraints of the class. This is, of course, what the representation design system 
tries to achieve. 

The system is not always successful at designing a maximally constrained represen- 
tation. Fortunately a representation that captures most of a problem's constraints is 
still useful because, we are still better off than we were with the initial representation. 
The reason is that as constraints get captured in a representation, the space of pos- 
sible situations that must be searched by a problem solver using that representation 
is reduced. The result of representation design is a more constrained representation 
and a smaller collection of statements (the uncaptured ones) that the problem solver 
must reason about explicitly in that representation. 5 



5 In effect, the problem solver uses the constrained representation to accelerate the problem solving 
process in the same way that specialized reasoners have been used to accelerate theorem proving. 
There are issues involved in the interface between a theorem prover and representations of the sort 
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1.2.7 Deriving an Initial Representation 

Problems are stated in a sorted logic. 6 One possible description of the initial rep- 
resentation is the collection of data types that are isomorphic to the sort signatures 
of the problem concepts, e.g., since the initial statement of the FAMILIES problem 
contains N : family-member, married : family-member x family- member, the 
initial representation of the FAMILIES problem would contain 
(deftype family-member specializes individual) 

married: relation(family-member, family-member). 
The resulting data types are a description of a complete representation because every 

problem situation can be expressed using them However, subsequent processes will 

design representations for all the concepts that appear in this description and some 

of these concepts may be unnecessary. 

Instead of including all concepts, the system attempts to develop a description of 
the smallest set of concepts that constitute a complete representation. There are 
two types of concepts that the system attempts to exclude in this process: irrelevant 
concepts and redundant concepts. 

Problems often contain concepts that are not relevant to their solution. We do not 
want to design representations for irrelevant concepts. Therefore, the system attempts 
to exclude irrelevant concepts from its description using a technique described in 
Chapter 3. 

Problem statements also, at times, contain redundant information. For example, it 
is usually redundant to capture constraints on a defined concept if we have captured 
all the constraints on the concepts in its definition. For instance, if we design a 
representation to capture all the constraints on child, it is not necessary to design a 
separate representation for grandchild because it is denned in terms of child. 

While deriving the description of the initial representation, the system tries to identify 
denned concepts and exclude them from the description. 7 However, a subsequent 



that the system designs which I have not yet attempted to work out. However, the approach used 
in [Miller & Schubert 88] should work here. 

8 However, a problem's sort information may be incomplete. Chapter 3 describes a simple tech- 
nique that the system has for computing sort signatures from partial information. 

7 Sometimes the choice of whichconcepts are primitive and which are defined is arbitrary. For 
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process of representation design may decide to represent a concept that is excluded 
at this point. The techniques that the representation design system uses to identify 
defined concepts are discussed in Chapter 3. 

Once the representation design system has identified a set of concepts that it believes 
is the smallest complete representation, it uses the associated data types as its initial 
description. While complete, this representation does not capture any of the con- 
straints of the problem class, i.e., situations can be created that violate any of the 
statements in the class description. The rest of the processes of representation design 
work to capture all the constraints of a problem class. 

The next three sections describe the classification, concept introduction, and oper- 
ationalization processes which attempt to design maximally constrained representa- 
tions. Classification and concept introduction run as coroutines to capture as many 
of the constraints of a problem as possible by selecting different library structures to 
represent problem concepts. These processes take three inputs: the class description 
(derived from the problem statement after irrelevance is eliminated), the description 
of the initial representation, and axiom schemas defining the invariants on the li- 
brary structures (ADTs). They produce two outputs: a description of the designed 
representations and the statements in the class description that are not captured by 
the representation. Operationalization then tries to capture the constraints of any 
remaining statements by writing new procedures and using these to further constrain 
the representations created by classification and concept introduction. 

1.2.8 Classification 

The representation design system captures constraints on a concept by representing 
it in terms of library structures enforcing those constraints. The structures that 
together capture the most constraints on a concept are identified by classifying that 
concept in a hierarchy of the library structures. The hierarchy is organized around 
the constraints that the structures enforce (see Figure 1.9). Structures enforcing 



example, given the statement 

\/xVy[child(x, y) o parent{y, x)} 
and no other general statements about either concept, one is arbitrarily considered to be defined 
in terms of the other. 
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additional constraints specialize structures enforcing fewer constraints. For example, 
the structure 1-1 function specializes function because it enforces the additional 
constraint, "one-to-one"(i.e., f(x) = f(y) => x = y). 
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Figure 1.9: Part of the structure taxonomy 

The system classifies concepts in the hierarchy, using their definitions in the initial 
representation to determine where to begin. For example, the representation for 
married is initially defined as 

married: relation(family-iriember, family-member) 

so the classification of married begins at the relation node. 

During classification, the constraints labeling links in the hierarchy are treated as 
properties to show of concepts. As it reaches each node in the hierarchy, the system 
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looks at the properties labeling links leaving that node, trying to determine which 
properties the concept has. For example, the classification of married begins with 
the system looking at the links leaving the relation node and tries to determine if 
married is unary, binary, or n-ary. It is found to be binary from its initial represen- 
tation. 

The hierarchy is also annotated to show which properties leaving a node are disjoint 
(the arcs connecting links in Figure 4.1). For example, unary is annotated as disjoint 
from binary. Classification at a node is complete when the system has examined each 
disjoint group of properties at that node and either it has identified one property 
from the group or it has determined that the concept being classified has none of 
the properties in that group. Classification then continues with each of the nodes 
below the properties identified. For example, after married is classified as binary, 
classification continues with the node labeled bin-rel; when the system determines 
that married is irreflexive, symmetric, and antitransitive, classification completes at 
this node. 

Classification treats unlabeled links differently. When classification completes at a 
node with unlabeled links leaving it, those links are marked traversable. Classification 
does not proceed down an unlabeled link until all other links entering the node 
below are marked traversable. For example, classification will only proceed from the 
node labeled sym-rel to eq-rel when the links leaving the nodes labeled sym-rel, 
ref-rel, and trans-rel have all been marked traversable. 

To determine whether a concept has a property, the representation design system 
looks for a statement in the problem that indicates that the property holds. For 
example, to determine that married is symmetric, the representation design system 
looks for a statement of the form: 

\/x\/y[married(x,y) — ■ married(y.x)\. 

This technique by itself often fails because properties can be stated in many ways. 
The system also has a heuristic technique (discussed in Chapter 4) that tries to 
transform statements into a form that classification is expecting. Since the technique 
is heuristic, the system may fail to determine that a concept has a property even when 
it is stated. 
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When the representation design system is unable to show that a concept has any of 
the properties in a disjoint group leaving a hierarchy node, it asks the user about each 
property in turn. For example, if it is not able to determine whether or not married 
is symmetric, it asks the user (presumably the answer is "yes"). If the answer were 
"no," the system would ask whether it is antisymmetric. 

When the system obtains positive information about a property from the user, it adds 
the statement of the property to the problem. For example, when the user replies 
that married is symmetric, the system adds the statement (shown above) expressing 
that fact. 

Perhaps more interesting that the classification algorithm itself is the library of struc- 
tures that it uses (part of which is shown in Figure 4.1). This library was compiled 
from my observations of the "structures" that people use in their diagrams when 
solving analytical reasoning problems. It turns out that these structures are partic- 
ularly efficient combinations of structure and behavior. In addition, the constraints 
that they capture are applicable to a large number of analytical reasoning problems. 
For example, even though capturing a constraint generally requires procedures that 
respond to new specific facts by adding still more specific facts, clever use of structure 
can avoid this. In the FAMILIES problem, for instance, the structure designed for 
"married" captures the symmetry constraint in this way. Married couples are repre- 
sented as sets of size two. This avoids having to do any extra work to enforce the 
symmetry constraint because, using the fact that {x,y} = {y, x}, the combination of 
the structure and procedures associated with the set representation make it the case 
that the statement married(N , P) is the same as married(P,N). 

As constraints get captured during classification, statements expressing those con- 
straints are identified by a process called capture verification, then removed from 
the class description. Whenever classification of a concept terminates, capture ver- 
ification checks the general statements referring to the concept to see whether their 
constraints have been captured. The intuition behind capture verification is that a 
constraint is captured by a representation if it is true in every situation that can be 
created in that representation. Statements are checked by attempting a constructive 
proof using the representations of the concepts in the statement to build situations, 
i.e., the abstract data types designed are used to build data structures. 
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For example, to verify a conditional statement, capture verification constructs a prob- 
lem situation representing the antecedent of the statement and then checks that sit- 
uation to see if the consequent is true. If so, then the representation captures the 
constraint of that statement. 8 For instance, consider how capture verification checks 
the statement 

\fxiy[married{x,y) => married(y,x)}. 

A situation is created representing the antecedent, i.e., a situation in which one 
anonymous individual, say .4, is married to another, say B. Then the situation is 
inspected to see if it is also the case that B is married to A. When married couples 
are represented as sets of size two, this proof will succeed. In this case, the system 
concludes that the constraint of the statement is captured. 

1.2.9 Concept Introduction 

Classification has a serious limitation: its success depends on the particular vocabu- 
lary used to state a problem. The FAMILIES problem, for example, is stated in terms 
of married, which is classified beginning with relation. None of the specializations 
of a relation capture the fact that married couples are all of size two. However, 
if the problem had been stated in terms of couples, classification would have been 
more successful because a couple is a specialization of set that takes advantage of 
size constraints. Concept introduction enhances classification in this example by in- 
troducing couple when it detects that the size constraint on married is not captured 
by married. This illustrates that, given a different vocabulary, classification is more 
successful because it allows different and sometimes more specialized knowledge in 
the structure library to be brought to bear in representation design. 

Introduction extends the classification process by adding a new concept to a problem 
when classification of an existing concept does not capture all of its constraints. 
The introduced concepts allow the system to view an existing concept differently. 
For example, introducing couple for married allows the system to view "married" 
differently. 



8 More generally, a disjunction is verified by creating a situation in which all but one of the 
disjuncts is false and then checking that the last disjunct is true. 
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When a new concept is introduced, a corresponding representation is also introduced. 
For example, couple is actually introduced in several steps. I will use the first step to 
illustrate the parallel introduction of new concepts and representations. In the first 
step, the concept spouses is introduced. This is a function mapping individuals to the 
sets of individuals who are their spouses. A representation, spouses is also introduced 
and defined as a function mapping an individual to a set of individuals. 

A new concept is always defined in terms of an existing concept in such a way that 
the semantics of a problem are not changed. For example, spouses is defined as 

VxVj/[y € spouses(x) <^> married(x,y)}. 

There are two reasons for introducing new concepts. The primary reason is to give 
the representation design system access to different representation design knowledge: 
the library structure representing the new concept captures different constraints and 
has different specializations. In the example above, for instance, since married does 
not capture all the constraints on "married," spouses is introduced to gain access to 
representations that capture different constraints. 

The other reason to introduce a new concept is that it enables reformulation of the 
statements in the problem. This is useful because it often allows new properties to be 
recognized in the problem statement. Reformulation is accomplished by treating the 
logical definition of a new concept as a rewrite to perform on the problem statements 
to derive new statements. For example, when spouses is introduced, it is defined as 

Vx\/y[y 6 spouses(x) c=- married(x : y)}. 

This is treated as a rewrite rule that derives a statement referring to spouses for 
every statement referring to married. For instance, this rule derives the statement 
P £ spouses(N) from married(N.P). 

As new concepts are introduced, the representation design system explores a space of 
alternative problem formulations. For example, after spouses is introduced, there are 
two formulations: one in terms of married and one in terms of spouses. The system 
has a technique for estimating the cost of problem formulations so that alternatives 
can be compared (discussed in Chapter 5). Alternative formulations are maintained 
because the cost estimation technique works only when all the relevant alternative 
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concepts have been fully classified. For example, the formulation of the problem in 
terms of married can not be compared to the formulation in terms of spouses until 
spouses is fully classified. 

Classification extended by introduction is called extended classification. Extended 
classification is one of the most interesting aspects of this research. While the two 
processes involved in it are fairly simple, the behavior of the combination can result 
in sequences that significantly reformulate a problem. Several examples of extended 
classification are given in Chapter 5; one example shows how extended classification 
of married results in the following sequence of introductions: 9 

1. The concept spouses is introduced. This is a function from individuals to the 
sets of individuals to whom they are married. 

2. The concept non-empty- spouses is introduced. This is a partial function from 
individuals to the non-empty sets of individuals to whom they are married. 

3. The concept spouse is introduced. This is a partial function that captures the 
fact that individuals have at most one spouse. 

4. The concept couple is introduced. This is a partial function from individuals 
to the married couple that they are members of. Couple captures the following 
facts: not all individuals are married, each married couple is disjoint from all 
other married couples, married couples contain exactly two members. 

The discussion so far has described how extended classification designs representa- 
tions that capture general statements. The introduction process is also used to cap- 
ture a special kind of mixed statement called a restriction. A restriction is a mixed 
statement that restricts the individuals that can stand in some relation to a particular 
individual. Here are two examples of mixed statements that are restrictions: 

\/x-*br other (M, x) 

(restricts the individuals that can be brothers of M ), 

Vx[child(Q,x) <3> x = R] 

(restricts the individuals that can be children of Q). 



9 Note that the names that the system makes up for introduced concepts are gensyms. For clarity 
of presentation, meaningful names have been substituted for these throughout this document. For 
example, I have substituted spouses for the meaningless name that the system gives to a concept it 
introduces for married. 



1.2. OVERVIEW OF REPRESENTATION DESIGN 33 

By contrast, the statement 

\/x brother(\l, x) 

is not a restriction because it does not restrict the brothers of M. 10 

During extended classification of a concept, the system tries introductions that re- 
formulate a problem so that restrictions become specific constraints. For exam- 
ple, during the extended classification of the relation brother, the statement, 
Vx-ibrother(M, j), is captured by introducing the function brothers that maps from 
an individual to his/her set of brothers and rewriting statements referring to brother 
so that they refer to brothers instead. This reformulation transforms the above state- 
ment into brothers(M) = 0. Notice that this is a specific statement, i.e., it does not 
contain any universally quantified variables. Similarly, the statement 

"i x[child(Q , x) <==> x = R) 

is captured by reformulating child as a function children that maps individu- 
als to their sets of children. Then this statement becomes the specific statement 
children(Q) = {R}. 

There are two reasons for handling restrictions in this way. First, the reformulated 
representation is never less efficient than the original. Second, it results in a type 
of generalization of the representation by introducing sets. If a problem contains 
a statement restricting the number of individuals that can stand in some relation 
to a specific individual, we would like the system to design a representation that 
is not dependent on the number of individuals in the given statement. This way 
the representation can be used for another problem that has a similar constraint, but 
differs in the number of individuals involved. For example, we want the representation 
designed for a problem containing the statement 

\/x[child(P,x) -=> x = R\ 

to also work for a similar problem containing 

Vx[child(P,x) ^> x = R V x = S], 



10 Note that there is no guaranteed syntactic technique for establishing that a statement is a 
restriction. Therefore, there is no guarantee that all restrictions will be reformulated as described 
here. 
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and so on. 

The system accomplishes this by generalizing the sets introduced when restrictions 

are reformulated. For instance, to accomplish this in the above example, the system 

introduces the function children, a mapping from an individual to his/her set of 

children, then reformulates. The reformulated versions of the two statements above 

are 

3y 1 3z l [children(y 1 ) = {z x }) 
3y 2 3z 2 3w 2 [children(y 2 ) = {z 2 , w 2 }] 

which the system generalizes to 

3y3z children(y) = z, 

where z is taken to be a set. Hence, a representation is designed to allow statements 
restricting the set of an individual's children to a constant set of any size. 



1.2.10 Operationalization 

In the ideal case, extended classification maximally constrains the initial represen- 
tation. Given a fixed collection of concepts, classification comes up with the most 
constrained collection of representations, while introduction modifies the collection 
of concepts when they fail to capture all of a problem's constraints. The success of 
extended classification depends on the collection of library structures and it can fail 
to produce a maximally constrained representation because there are many combi- 
nations of constraints that the library structures can not capture. When extended 
classification fails, the representation design system makes a final effort to maximally 
constrain the representation through a process called operationalization. 

Operationalization tries to capture constraints by writing procedures that respond 
to any new information added to situations, in order to enforce the constraints of 
statements. Each procedure written for a problem statement responds to the addition 
of information that can violate the constraint by adding still additional information 
to reestablish it. 

Consider an example: Suppose that in designing a representation for the FAMILIES 
problemrextended classification .produces the concept siblings, a function from indi- 
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viduals to sets of individuals. Suppose further that the following statement has not 
been captured, 

\/xVy[x E siblings(y) §j£ siblings(x)}. 

Operationalization will write a procedure that adds y to the siblings of x whenever 
x is added to the siblings of y. This new procedure captures the constraint of this 
statement whenever an element is added to a set of siblings in a problem situation. 

Notice that if this operation were the only one that could change sibling sets, then 
the procedure described would capture the constraint of the statement above because 
it would ensure that no problem situation could be created in which y is a member 
of the siblings of x unless x is also a member of the siblings of y. 

Of course, in general, there are different ways to add information to a problem situa- 
tion. Therefore, operationalization must generate a procedure like the one above for 
every different way that information affecting a statement's constraint can be added. 

It turns out that the ways information can be added to a problem situation are 
restricted by the kinds of operations that can add information. Operations that can 
add information are restricted by the procedures that already exist for the abstract 
data types implementing the problem representation. 

For the problems that the system has been tested on, operationalization has al- 
ways succeeded in capturing all constraints left uncaptured by extended classifica- 
tion. However, there is no guarantee of this in general. As noted, a representation 
that captures most of a problem's constraints is still useful in accelerating a problem 
solver's reasoning. 



1.3 Analytical Reasoning Problems 

The current system designs representations for analytical reasoning problems. These 
problems appear on intelligence tests like GREs and LSATs. They are intended 
to test a person's ability to "draw logical conclusions from information presented 
and to synthesize that information in order to deduce the actual structure of or 
interrelationships among things." [Weber83j 
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Analytical reasoning problems present a collection of facts about a specific situation 
and then ask questions that require deduction from those facts. They never ask 
questions requiring induction from a set of facts. There are four types of questions: 

1. Is some specific fact or restriction true of a problem situation? For example, 
"Is it true that M has no brothers?" 

2. Is it possible for some specific fact or restriction to be true in a problem situa- 
tion? For example, "Is it possible for N to be the brother of M in the current 
situation?" 

3. Find all the individuals that stand in some relation to a specific individual in a 
problem situation. The FAMILIES problem question is an example of this type: 
"Find all the individuals that are siblings of S." This type of question may also 
ask if the individuals found are all the individuals that can have this property or 
are just all that can be found in the problem situation. For example, a question 
may ask us to find all the siblings of an individual and then ask whether it is 
possible for there to be others. 

4. Find the individual that stands in some relation to a specific individual. This 
question assumes there can be only one such individual. For example, "Find 
the mother of S." 

The representation design system accepts problems with the types of queries described 
above and designs representations specifically to answer those queries. A problem 
query can influence representation design in several ways. For example, the system 
usually reformulates problems so that find-all queries are translated into find-the 
queries because find-the queries can be answered more efficiently. For instance, the 
FAMILIES problem is initially stated in terms of the relation sibling. Given the 
query, ''find all the siblings of S," the system reformulates the problem in terms of 
siblings, a function mapping an individual to his/her set of siblings so that the query 
becomes, "find the sibling set of S." 

The system also reformulates problems that have queries asking about mixed facts 
because these, too, are often very expensive to answer. For example, if instead of 
the statement \/x^brother{M,x), the FAMILIES problem contained the query, "Is it 
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necessarily the case that M has no brothers?" the system would still reformulate the 
problem in terms of brother sets. 

Even though analytical reasoning problems do not require it, the system can also 
answer necessity and possibility queries about general facts. We now explain how 
each of the different types of queries is answered for statements of the form 



11 



<Pi =* ■■■(<£„ => (Vi A-.-A^m)). 

Queries are converted to one or more statements in this form when they are not 
stated as such. 

1. To determine if some unconditional (i.e., n = 0) specific fact is true in a problem 
situation, the system inspects the situation to see if the fact is present. If it is, 
then the system answers yes; otherwise it responds that it does not know. 

To determine if a conditional specific statement is true, the system extends the 
situation adding facts that correspond to the antecedents (i.e., each of the <£,•) 
and then, if the consequent (i.e., the conjunction of the 0,-) is true, it answers 
yes. For example, suppose the system is asked to show that the following fact 
is true: 

married(A.B) =>• sibling(A,C), 

where A, B, and C are individuals in a situation. It does this by adding 
married(A,B) to the situation. If sibling(A,C) is true in the new situation, 
the system answers yes. Otherwise, it says that it does not know. 12 

To determine if a statement involving universal quantification is true, the system 
follows the procedure used by capture verification. To check an unconditional 
statement, the system extends the situation by creating anonymous individuals 
for each of the universally quantified variables in the statement. If the uncon- 
ditional statement is true of those individuals in the new situation, the system 
answers yes. 



Note that we could just as well have used disjunctive normal form, i.e., we could replace this 
statement with m statements of the form 

-^ v • • • v -.<£„ v 4> { . 

However, it is more natural to explain representation design for statements in this form. 
As explained in Chapter 4, there is one other case. The system also answers the question yes if 
the situation containing married(A, B) is contradictory. 
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To check a conditional statement, the system proceeds as though checking a 
specific conditional statement except that it creates anonymous individuals for 
the universally quantified variables. 13 For example, suppose the system is asked 
to show that the following statement is true in a situation: 

Vx[child(S, x) <& (x = AV x = B V x = C)\. 

This is the question, "Are A, B, and C the only children of S?" To establish this, 
the system converts it to two statements (in the form described above) corre- 
sponding to the two directions of the biconditional. Then it does a constructive 
proof corresponding to each of these statements. 

2. To determine if it is possible for some unconditional specific fact to be true in 
a problem situation, the system adds that fact to the situation. If a contra- 
diction results, the system concludes that the fact is not possible, otherwise it 
responds that it does not know. A similar procedure is included for conditional 
specific facts. As with necessity questions, possibility queries with universal 
quantification in them are handled by replacing the variables with anonymous 
individuals. 

3. To find all the individuals that stand in some relation to a specific individual, the 
system searches the problem situation looking for all facts relating individuals 
to the specific individual. To determine if the individuals found are the only 
individuals that can stand in that relation to the specific individual, the system 
constructs and attempts to prove an appropriate mixed query. For example, 
suppose the query is "find all the children of S" and the system finds that A, 
B, and C are all children of S. Then, to answer the second part of the query 
it tries to prove that A, B, and C are the only children of S, as described in 1 
above. 

4. A find-the query can only ask for the value of a functional expression like 
"find-the sibling s(S) .'' These questions are answered by retrieving from the 
problem situation the image of an individual under the function. For example, 



13 If there are universally quantified variables appearing in the consequent that do not appear in 
the antecedent, then corresponding anonymous individuals are created and the system checks that 
the consequent is true for these. 
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the query, "find-the siblings(S)" is answered by retrieving from the problem 
situation the image of S under siblings (the representation of siblings). 

As noted above, the representation design system can transform find-all queries 
into find-the queries by reformulating a problem. The reformulation has the 
effect of adding a new kind of individual to the problem. For example, the 
query "find-all x \ sibling(S, x)" is transformed into "find-the sibling s(S)," 
where the range elements of siblings are sets of individuals. One advantage 
of reformulating in terms of a find-the query is that sets can be marked to 
indicate that all members are known. This makes answering the second part of 
a find-all query much easier. 

1.4 Soundness of Representation Design 

One question that arises about the representation design process concerns how much 
faith we should place in the answers that we get with representations designed by 
the system. This is the question of whether or not the representation design process 
is sound. 

I have shown that if the representation design system produces a fully constrained 
representation and that representation halts while building a problem situation, then 
the answers produced with that representation are always answers to the original 
problem. 

There are two kinds of lemmas that must be proved to obtain the soundness result. 
First, since representation design changes problem statements, we must prove that 
all transformations done on the predicate calculus problem statement are sound. Of 
particular interest is the process of introduction because it adds new concepts to a 
problem. Chapter 5 gives a soundness condition for introductions. I have shown that 
all introductions in the system meet this condition and, therefore, the introduction 
process as a whole is sound. 

The purpose of the second kind of lemma is to show that the process of capturing 
constraints in a representation is sound. In particular, we must show that when the 
constraint of a general statement is capt-ured in a representation, the representation 
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can be used only to create situations in which the constraint is true. 

1.5 Scope and Limitations 

The representation design system is knowledge based and, as with other such sys- 
tems, it is very difficult to formally characterize its scope of applicability. This 
section, therefore, attempts to converge on this issue from two perspectives. First, 
we give evidence indicating that the current implementation provides the framework 
and much of the knowledge required to handle most analytical reasoning problems. 
Second, we roughly characterize the scope of applicability of the current system and 
give examples of problems to which the system is not applicable. 

1.5.1 Coverage 

Section 1.3 provided an informal characterization of the class of analytical reasoning 
problems. I believe that the current system can easily be extended to provide rea- 
sonably high coverage of these problems. I characterize the coverage of the existing 
knowledge by answering the following two questions: First, how well does the exist- 
ing library cover a large body of analytical reasoning problems? Second, how much 
overlap is there in the knowledge used on different problems? 

There is evidence to suggest that the current structure library population is very 
close to a sufficient collection for designing representations for most analytical rea- 
soning problems. The current system was designed after a survey of approximately 
two hundred analytical reasoning problems. From these, I compiled a set of twenty 
representative problems. From the representative set, I chose three problems that I 
found among the most difficult to solve. I call these the paradigm problems. I then 
studied the representations that I and two other people used in solving the paradigm 
problems. The current structure library population is the result of that study. 

Examination of the other seventeen representative problems showed that the existing 
structure library was sufficient: the problems did not use any additional structures. 

This research would not be interesting if each problem solved used a disjoint subset of 
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the knowledge. This has proven to be far from the case. The current representation 
design system generates representations for the three paradigm problems (shown in 
Figure 1.2, Figure 1.5, and Figure 1.6) and three variations (twelve problems total). 

Even though the paradigm problems appear quite different, the representations gener- 
ated for them use only structures from the existing library. Furthermore, Figure 1.10 
shows that most of the structures are used in more than one problem. Thus, there 
is significant overlap in the structure library knowledge used in designing representa- 
tions for the paradigm problems. 



Problem 


Structures Used 


FAMILIES 


function, 1-1 function, 




partial-function, partial- 1-1-f unction, 




disjoint -set, fixed-size-disjoint -set 


WAITERS 


function, fixed-size-set, set 




antitrans-antisymm-irref-rel 


PROFESSORS 


1-1-function, 1-1-partial-function, set, 




fixad-sizfl-sflt 



Figure 1.10: Library structures used in each paradigm problem. 

There is also overlap in the introduction knowledge used in generating representations 
for these problems. There are a total of eleven rules in the introduction knowledge 
base. Figure 1.11 shows the number of different rules used in designing each paradigm 
problem and the number of total rule firings that occurred in the design efforts. All 
of the rules in WAITERS were also used in the other two problems. Four of the rules 
in PROFESSORS were used in FAMILIES. 



Problem 


# Different Rules 


# Total Rule Firings 


FAMILIES 

WAITERS 

PROFESSORS 


9 
3 
5 


31 

7 
11 



Figure 1.11: Data on introduction rule usage in paradigm probl 



lems. 
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1.5.2 Scope of Applicability 

The current version of the system designs representations for problem classes. A 
class description is a set of first order statements and a set of queries about specific 
or mixed facts. Clearly, the system's applicability is limited to classes of problems 
that can be stated in this language. An example of a problem that can not be stated 
naturally in this language is one requiring a proof by induction. 

There are also problem classes that can be stated but for which the system designs 
an incomplete representation. A representation designed for a problem is incomplete 
when it captures all of the constraints on the problem but it can not be used to 
reliably answer the types of queries given in the problem statement. For example, a 
representation for a problem is incomplete if that representation captures all of the 
constraints on the problem, the problem contains the query □</>, <j> follows from the 
problem, but the system answers the query with unknown. 

Note that the notion of a complete representation is much weaker than the notion 
of a complete deductive system for first order logic because queries are restricted 
to ask about specific or mixed facts with all mentioned individuals being named 
individuals. This means that a representation can be complete without being able 
to correctly answer queries about general statements or statements with existential 
quantification. 14 

One source of incompleteness is that, in general, the system can not design repre- 
sentations for disjunctive problem situations. The difficulty is best illustrated by an 
example. Consider the following propositional problem: 

WvT 
RV S 

(WAS)=>Q 
(TAR) =>Q 
(TAS)^Q 
Query: UQ. 

Q follows in this problem. However, the representation that the system designs will 



14 Note also that even though section 1.3 discusses using representations to answer queries about 
general facts, these are not included in the notion of a complete representation and we will never 
guarantee their completeness. 
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not determine this. The difficulty is that there is no way to create a problem situation 
representing W V T without committing to one of them actually being true. In this 
problem, the truth value of Q depends only on W \/ T and RV S being true. 

One way to handle disjunction like this is to use the representation that the system 
designs to reason by cases. For example, a problem solver could use the above rep- 
resentation by assuming W A R and noting that Q follows; then assuming WAS 
and noting the same; and so on. By trying all possibilities, it could establish that Q 
follows. One reason that the system does not design representations that reason by 
cases to enforce constraints is that doing so requires exponential time. 

Note that, even though the system does not design representations for arbitrary 
disjunctive situations, it has special purpose techniques for designing representations 
for certain restricted forms of disjunction. 



1.6 Related Work 

This section highlights the differences between the research reported in this thesis 
and previous work in several areas. A more thorough treatment can be found in 
Chapter 8. 



1.6.1 Solving Word Problems 

One might expect there to be an interesting relationship between my system and 
previous systems that solve word problems. However, most work in this area has been 
concerned with translating an English problem statement to a pre-established target 
representation. For example, the object of STUDENT [Bobrow68] was to translate 
a high school level algebra word problem into a system of algebraic equations. Also, 
Bobrow was primarily concerned with natural language problems and I have not 
addressed this. 
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1.6.2 Automatic Programming 

Most automatic programming systems perform tasks such as algorithm design, data 
structure selection, and optimization (of algorithms and data structures) within a 
fixed representation which is chosen by a person prior to the point where the system 
gets involved. By contrast, my system is concerned with earlier steps in the problem 
solving process during which a representation is designed. To illustrate this difference, 
consider the work on data structure selection reported in [Barstow79j. This system 
would, for example, select a data structure for implementing some set described in 
a program specification. By contract, my system is concerned with identifying the 
"right" sets with which to represent the problem. In most problems, it introduces 
new sets because they can be implemented more efficiently. Furthermore, my system 
chooses the same data structure to implement all sets. Thus it has an entirely different 
level of concern. 

Many efforts in automatic programming have generated a program from a formal 
specification including axioms describing the desired program and the programming 
language operations available to implement that program. As many researchers work- 
ing in this area have pointed out, the formulation of these axioms can have a dramatic 
effect on whether this approach succeeds. Searching for better formulations of a set 
of axioms is a large part of what my research is about. 

1.6.3 Good Representation 

As I have already stated, the principle difference between my research and previous 
efforts to understand what makes a representation good is that they were concerned 
with recognizing the properties of good representations, while my research is about 
generating such representations prospectively. 

1.6.4 Problem Reformulation 

In some ways my work can be seen as an extension of [Korf80l Korf was concerned 
with characterizing a space of possible representations and types of transformations on 
representations. My work is concerned with how to choose the right transformations 
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to do to arrive at a good problem representation. I have identified some of the essential 
properties of representations and given a method to design representations with those 
properties. 

Korf (and Amarell Amarel68]) viewed problem solving as state space search and ob- 
served that changes in representation (i.e., the description of a problem state) affect 
the size of the space. The focus of my work has been explaining how representations 
do this and how to design representations that yield smaller search spaces. The claim 
is that one way to reduce a problem's search space is to design a representation that 
captures more constraints in its structure and behavior. 

1.6.5 Specialized Reasoners 

There is a body of work whose concern is to develop a framework in which multi- 
ple specialized reasoners can be used together in problem solving (see, for example, 
[Brachman et.al84, Miller k Schubert 88]). Each reasoner is specialized to perform 
certain inferences efficiently and together a collection of such mechanisms can be 
used to accelerate a general problem solver (usually a theorem prover). My research 
complements this work because my system can be viewed as designing specialized 
reasoning systems. 



1.7 Reader's Guide 

The next chapter explains the three evolving descriptions that the representation 
design system maintains throughout the design process. One can think of these as 
descriptions of the problem statement, the representation being designed for that 
problem, and the relationship between the first two descriptions. The chapter also 
describes a language for defining and instantiating abstract data types. This is used 
by a person defining library structures and by the system in introducing new repre- 
sentations. 

Chapter 3 describes how the system derives a description of a problem's initial rep- 
resentation-. This is done in three steps: 
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1. The primitive concepts are identified from a problem statement. 

2. Irrelevance is removed from the problem. 

3. The representation of each relevant primitive is added to the description. 

Chapter 4 discusses the classification process and the knowledge used in the process. 
Chapter 5 discusses the concept introduction process, explains how classification and 
introduction are integrated as extended classification, and gives detailed examples 
from extended classification in the FAMILIES problem. Chapter 6 discusses the 
operationalization process. 

Chapter 7 details the way representations are implemented. First, it explains the 
equality system relied on by representations and a mechanism for creating anonymous 
individuals. It explains the behavior of the library structures individual, relation, 
function, and set and how they integrate with the equality system. An example is 
given of the system creating a specialized representation. 

Finally, Chapter 8 discusses related work and Chapter 9 provides a summary and a 
discussion of future work. 



Chapter 2 

Descriptions Used in 
Representation Design 



The next four chapters discuss the representation design process in more detail. In 
preparation for this, this chapter explains the descriptions that the representation de- 
sign system maintains throughout the design process. One can think of these as three 
evolving descriptions: one of the problem statement, one of the representation being 
designed for that problem, and the third a description of the relationship between 
the first two. 

The problem statement language (PSL) is a sorted first order logic. The representation 
description language (RDL) is a collection of constructs for defining abstract data 
types implementing representations and representation prototypes. 



2.1 The Problem Statement Language 

Problems consist of a collection of statements and a collection of queries. The state- 
ments include sort declarations for the domain individuals (e.g., N : family-member) 
and statements in the logic. The logic is sorted and contains the distinguished re- 
lations G and -. For convenience, appropriate syntax is included so that terms can 
denote extensional and intentional sets. Extensional sets are denoted by terms of the 
form, {£1,..., £„}, where each of the U are terms. Intentional sets are denoted by 
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terms of the form, {x j 0}, where <f> is a formula in which x occurs free. 1 

The query language allows necessity, possibility, find-all, and find-the queries. Ne- 
cessity queries, written □<£ (where is a specific or mixed statement in the sorted 
logic), ask if a given fact is true in a problem situation. Possibility queries, written 
O0?, ask if a fact is consistent with a problem situation. Find-all queries, written 
find-all x \ <t> (where x is free in <j>), ask for all individuals in a problem that stand in 
some relation to a specific individual. The FAMILIES problem query is an example 
of this type: 

find-all x i sibling(S,x). 
In addition, an answer to a find-all query states whether the individuals found are 
all the individuals for which <p is true, or just all the individuals that could be found 
given the problem statement. Find-the queries are written find-the r, where r is a 
function application term in the logic. These queries ask for the individual that the 
function application denotes in a problem situation. 



2.2 A Language for Defining Representations 

Representations and prototypes are implemented as abstract data types (ADTs). 
The representation design system has a language for defining ADTs. It is used in 
two ways: by a person to define structure prototypes and by the representation 
design system to design new representations in terms of prototypes. Prototypes are 
implemented as parameterized ADTs, for example, the library structure relation is 
parameterized so that instances can be created with different arities and over different 
sorts of individuals. 

Library prototypes are defined by a person using the deftype construct. Its form 
is: 2 



1 These extensions to the syntax are not formally required. We can, for example, replace the term 
{a; | <£}, by a variable s and then include the formula Vx[x € * ^> 4>]. 

2 The syntax used here is a stylized version of the actual syntax used in the implementation. The 
actual syntax is similar to LISP flavor syntax. 
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(deftype type-name (parameter-list) 
[declaration-list] 
[proc-list]) 

Type-name is a symbol that names the type being defined. Parameter-list is a list 
of formal parameters that are bound to particular types and constants when instances 
or subtypes of type-name are defined. The declaration-list defines the structure 
part of the new type. It is a list of instance definitions. Thus new types are defined 
with existing types as components. Proc-list is a list of definitions for procedures 
that operate on the structure components to provide the abstract operations of the 
new type. For example, here is part of the definition of the library type relation 
(some of the procedures have been omitted): 

(deftype relation (doml:type ftrest domains: list(types)) 
[relation-list: list (tuple (doml ftrest domains))] 
[(defproc add-relationship (rel: tuple (doml ftrest domains)) 

(push rel relation-list)) 
(defproc related? (rel: tuple(doml ftrest domains)) 
(member rel relation-list))]) 

The type relation takes as parameters a list of n domains and produces a represen- 
tation of an n-ary relation. For example, this type is instantiated to define married. 
The domains in the parameter list are the names of types representing sorts. The 
data structure part of relation is a list of n-tuples. List is another type defined 
in this language. It has a push operation and a member predicate. Representations 
defined as relations store n-tuples of individuals that stand in a relation in a par- 
ticular problem situation. For example, when creating a problem situation for the 
FAMILIES problem, new pairs like < N,P > get added to the relation list inside 
married. This is done by executing the procedure add-relationship with < N,P > 
as its argument. 

The RDL has facilities to define prototypes that inherit structure and behavior from 
other prototypes. For example, one can define a prototype sym-ref-rel for symmet- 
ric reflexive relations in terms of a prototype for symmetric relations and a prototype 
for reflexive relations. 
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The system designs representations for concepts and sorts in a problem. Represen- 
tations of concepts are designed by instantiating library prototypes. The RDL has a 
construct for this purpose which has the form: 

instance-name: type-name(parameters). 

As an example, married is defined as 

married: relation (family-member, family-member). 

Representations of sorts are designed as subtypes using one of the following con- 
structs: 

(deftype new-type specializes type-name (parameters) ) 
(deftype new-type specializes type-name(parameters) 
disjoint) . 
For example, there is a library prototype called individual. A representation is 

designed for the sort family-member as a subtype of individual: 

(deftype family-member specializes individual). 

Instances of this type are used to represent individual family members like Q. Subtype 
definitions have the standard semantics: every instance of a type is also an instance 
of its supertypes. 

The second form of the subtype construct is used to define disjoint subtypes, i.e., 

subtypes of a type Tx that are disjoint from all other defined subtypes of T x . For 

example, we can define the following disjoint subtypes of family-member: 

(deftype male-fm specializes family-member disjoint) 
(deftype f emale-fm specializes family-member disjoint) . 

Once a type is defined to be a disjoint subtype of some type Tj, all other types 

defined as subtypes of Ti are assumed to be disjoint. Hence, once male-fm is defined 

as above, all other types defined as subtypes of family-member are assumed to be 

pair disjoint from each other and from male-fm. 

The system always defines types to represent domain individuals as disjoint subtypes 
of individual. We did not do this in the example above because there is only one 
type of domain individual in the FAMILIES problem and thus in this case it does 
not matter which form of definition we use. 
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2.3 Maintaining the Representation Mapping 

The system maintains the representation mapping for concepts by maintaining a data 
type definition for the representation of each concept, e.g., 

married: relation (family-member, family-member). 
One part of the design process involves selecting more specialized library types for 
representing concepts and redefining their representations in terms of those more 
specialized types. The other part of representation design involves introducing new 
concepts (and reformulating the problem). As new concepts are introduced, repre- 
sentations get defined for them. 

An integral part of this process is the introduction of new sorts. We describe the 
role that sorts play in the representation design process by first reviewing the general 
role of sorts in logic and automatic theorem proving and then showing that the 
representation design system uses sorts in much the same way as a person developing 
a sorted formulation of a problem. 

It is well known that a sorted logic is equivalent to an unsorted logic containing 

a unary relation symbol for each sort, but that theorem proving in a sorted logic 

can be significantly more efficient [Cohn88]. There are two reasons for this. First, 

specialized inference rules are known that enforce relationships between sorts by 

restricting the sorts of objects that can unify. One can view these specialized inference 

rules as capturing taxonomic constraints between sorts. Second, sorted formulations 

are typically simpler than unsorted formulations. This is because some constraints 

between concepts are re-expressed as relationships between sorts and handled by 

the sort machinery. Therefore, the logical statements expressing those constraints 

are removed from the problem. Also many of the remaining statements simplify 

because sort predicates are removed. For example, in an unsorted formulation of the 

FAMILIES problem, the symmetry of married might be stated 

'i x\/y[f atnily-member{x) A family -member \y) 

=> (married(x,y) => married{y,x)),] 
while in the sorted version this statement is simplified to 

VxVy[married[x,y) => married(y,x)}, 
given-maraed : family- mtmh*r x family-member. 
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When formulating a problem in a sorted logic, a person declares sorts, specifies rela- 
tionships between those sorts, gives the sort signature of each problem concept, and 
then formulates sorted versions of the problem statements. Sorted logics differ in the 
relationships allowed between sorts; typically taxonomic relationships are specified. 
We will call the collection of relationships specified between the sorts in a problem 
the problem's sort structure. For example, in a problem concerning several different 
kinds of animals, one might specify the sort structure as the sort/subsort taxonomy 
shown in Figure 2.1. 

Animal 




Fox Wolf Bird Mammal 

/ \ 

Human Whale 

Figure 2.1: A sort tree for a problem about animals. 

There is always more than one way to sort a problem. Different sort structures 
capture different relationships between sorts and lead to different formulations of a 
problem. For example, the formulation of a problem in terms of the sort structure in 
Figure 2.1 captures a number of constraints in the sort structure, e.g., 

\/x[Fox(x) =r- Animal(x)}, 
while a sort structure that did not include one of these sorts would not capture 
the above constraint. 

Clearly, a skilled person formulating a problem in a sorted logic will attempt to find 
the formulation that captures the most constraints in the sort structure. 

Like the skilled person, the representation design system incrementally builds up a 
more sorted formulation of an input problem as design proceeds. The introduction 
of new sorts is the system's mechanism for picking out sets in a problem to design 
representations for: As new sorts are introduced, representations are designed for 
them. This allows the system to capture two kinds of constraints: sort/subsort 
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constraints and constraints between or common to the elements of a sort. It captures 
sort/subsort constraints by building a type taxonomy with the representations of 
sorts. There are special library structures for capturing constraints between elements 
of a sort. The system captures such constraints by designing representations for sorts 
in terms of these special structures. 

Normally, when formulating a problem, a set of sort symbols is introduced and then 
concepts (e.g., relations or functions) are defined in terms of these. The system 
introduces concepts and sorts simultaneously. Every new concept is logically defined 
in terms of an existing concept. Section 1.2.9 explained how such definitions are used 
as rewrite rules. Logical definitions of new concepts are also used to derive new sorts. 

When an introduced concept is a relation, the system treats it as a predicate defining 
a new sort: The sort defined is the subsort of the relation's domain for which the 
relation is true. For example, the following statement introduces the relation couple: 

Vx\/y[couple({x,y}) 44> married(x,y)}. 3 

It is treated as denning a new sort, which I will call married- couple, whose individuals 
are those doubleton sets (of family members) for which couple is true. 

When an introduced concept is a function, the system defines a sort for its range 
elements. For instance, when designing a representation for the FAMILIES problem 
the system introduces brother sets by introducing brothers, a function mapping an 
individual to his/her set of brothers. It is given the following logical definition: 

VxVy[x e brothers{y) «=> brother(y,x)}. 

This is taken as denning a new sort for the range elements of brothers, call this sort 
brother-set. Brother-set is a subsort of set( family-member), i.e., brother sets are 
sets of family members. A representation is defined for brother-set as 

(deftype brother-set specializes set (family-member)), 
making brother-set a subtype of set (family-member). Also a representation is 
defined for brothers as 

brothers: function (family-member, brother-set). 



For clarity of presentation, we will always write logical definitions with the new concept appear- 
ing on the left hand side. 
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This is an unusual way to define a sort. Normally we would define not de- 
fine a sort for brother sets, instead we would give the signature of brothers as 
family-member —> set{ family-member). However, one of the reasons for intro- 
ducing brothers is to investigate the structure of its range in an attempt to design a 
specialized representation for it. The system's mechanism for doing this is to define 
brother-set to be exactly the range of brothers. In exposition, we will often call such 
a sort a range and call the associated function its defining function. 

Note that while defining sorts like brother-set for the ranges of functions is unusual, 
it is equivalent to defining a sort predicate and then defining the function with the 
new sort. For example, we can define the sort predicate brother-set as 

V 'x[br other- set(x) v=- 3y(y 6 {: \ brother(x, :)})} 
and then specify the signature of brothers as family-member — > brother-set. 



Chapter 3 

Deriving an Initial Representation 



This chapter explains how a description of the initial representation is derived from 
a problem statement. The processes described here result in a complete represen- 
tation that is the starting point from which a maximally constrained representation 
(preserving completeness) is designed. 

The goal in deriving a description of the initial representation is to identify a set of 
concepts from which to design the specialized representation. This set, which we will 
call the set of represented concepts, should be sufficient to express all the constraints 
of a problem class. In addition, it is desirable that it include only concepts that are 
necessary for expressing the constraints of a problem class and responding to the 
query. 

We first discuss a sufficient condition. A primitive concept is one that is not defined 
in terms of other concepts. 1 To be sufficient, the set of represented concepts should 
include all the primitive concepts that are relevant to solving the problem. Primitive 
concepts are required because a representation can not be designed to capture all the 
relevant constraints without them. Suppose a relevant primitive concept is left out. 
Since this concept is relevant, it must have constraints defined on it that are used in 
solving the problem. But since the concept is not represented and not definable in 
terms of represented concepts, a representation can not be designed to capture those 
constraints. 



'The set of primitive concepts of a problem may not be unique. When concepts are mutually 
defined, one is chosen arbitrarily as a primitive. 



oo 
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Relevant concepts with restrictions 2 on them are included in the representation de- 
scription even if they are not primitive. For example, the statement, 

V 'x^br other \M, x), 

causes brother to appear in the set of represented concepts for the FAMILIES problem 
even though it is not a primitive. 

These concepts are included in recognition of a fact about the representation de- 
sign process: The strategy that the representation design system uses to capture 
restrictions is to introduce alternative concepts that transform the restrictions into 
specific constraints. For example, to capture the constraint of the statement above, 
the concept brothers is introduced and the statement is transformed into the specific 
statement, 

brother s(M) = 0, 
where brothers is a function from an individual to his/her set of brothers. 

Such introductions are done by extended classification. Since the only concepts that 
get classified are those that are represented, concepts with restrictions on them are 
included so that they will get classified. 

Concepts appearing in find-all queries are also included in the representation descrip- 
tion even if they are not primitive. Like defined concepts with restrictions on them, 
concepts appearing in find-all queries are included so that they will be subjected to 
extended classification. Extended classification of these concepts will usually cause 
the problem to be reformulated so that the find-all query becomes a find-the query. 
For example, sibling is included in the representation description of the FAMILIES 
problem because it appears in the problem's find-all query. Extended classification 
of sibling introduces the function siblings, mapping an individual to his/her set of 
siblings. This allows the problem query to be reformulated as 

find-the sibling s{S). 

Save the exceptions just discussed, it is desirable that the set of represented concepts 
include only primitives. The basic reason for this is economy of mechanism. Since 



2 Recall that a restriction is a mixed statement that restricts the individuals that stand in some 
relation to a specific individual. 
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instances will be designed for all the represented concepts, representing more concepts 
than are strictly necessary can cause unnecessary machinery to be designed. This will 
not only cause the representation design system to do more work than necessary but 
it will also cause extra work to be done in maintaining redundant constraints when 
the representation is used. 

Two types of concepts are considered unnecessary: those that are irrelevant to an- 
swering a problem's queries and those that are not primitive, do not have relevant 
restrictions on them, and do not appear in find-all queries. 

When a defined concept is represented, redundant general constraints will be main- 
tained by the resultant representation. For example, suppose brother is defined as 

VxVy[brother(x,y) ^ sibling{x,y) A male(y)} 
and that the concepts sibling and male are represented. Designing a separate repre- 
sentation for brother will cause its constraints to be captured redundantly because the 
representation brother will be designed to capture the constraints on brother. For 
instance, brother will capture the irreflexivity of brother. But notice that sibling 
will capture the irreflexivity of sibling and since brother(x,y) => sibling(x.y), the 
irreflexivity of brother will be captured twice. 

We now proceed to discuss the four steps that identify the set of represented concepts: 

1. An initial set of primitive problem concepts is identified. As we will see, when 
there is missing definitional information, it is acquired in this step. 

2. Concepts that are irrelevant are removed from the problem statement and from 
the set of represented concepts. The procedure that does this is sound but not 
complete: Only irrelevant concepts will be eliminated but it is possible for it to 
miss irrelevant concepts. Irrelevance is identified after the primitive concepts 
because the additional definitional knowledge acquired in step one can change 
what is relevant. 

3. Concepts that have explicit restrictions on them are identified and added to 
the set. Concepts that have implicit restrictions on them (i.e., those for which 
a restriction follows from the problem but is not stated initially) are identified 
later in representation design. 
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4. Concepts appearing in find-all queries are added to the set of represented con- 
cepts. 

A description of the initial representation is constructed by defining representations 
for the concepts identified in the four steps above. The initial representation of each 
concept is isomorphic to its sort signature. 

The next three sections of this chapter describe how steps 1-3 above are performed. 
Step four is straightforward. Derivation of the initial representation from a problem's 
sort signature is also straightforward. However, problem statements often give only 
partial sort information. The last section discusses some simple techniques that the 
system uses to complete a sort signature when only partial information is given. 

3.1 Identifying The Primitive Concepts of a 
Problem 

A concept is considered to be defined when there are equivalent necessary and suffi- 
cient conditions for it. The simplest case of this occurs when a concept appears alone 
on one side of a biconditional. For example, the following is a definition for brother: 

\/jNy[brother(x,y) <£> sibling(x,y) A male(y)]. 

Of course, the simple syntactic strategy of looking for such biconditionals will miss 
concepts with necessary and sufficient conditions that are semantically equivalent but 
syntactically different. For example, if we take a definition and break it up into two 
implications, the simple strategy will no longer recognize it. 

The system uses is a more general technique that depends on a connection graph built 
during the irrelevance analysis. This technique is described below in section 3.2.3. 
Here we point out that even the more general technique will sometimes fail to identify 
definitions. However, this never causes the system to design an incorrect represen- 
tation; it just causes the system to design extra machinery enforcing constraints 
redundantly. 

Identifying primitive concepts can be complicated by the fact that real problems are 
incomplete. An incomplete problem is one that does not supply sufficient information 
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to answer its queries. When a problem is incomplete, we can not determine whether a 
concept is primitive since its definition may simply be missing. In recognition of this 
possibility, the representation design system asks the user whether he/she would like 
to define any of the concepts that are primitive in the problem statement. Acquiring 
definitions for such concepts often uncovers additional concepts that did not appear 
in the problem statement. The process of prompting for new definitions continues 
until the user declines to further define any of the concepts that the system believes 
to be primitive. 

Acquiring definitional information is one of the techniques that the representation 
design system uses to try to complete problem statements. It is clear that this 
technique is at the mercy of the user. There is nothing the system can do if he/she 
declines to provide a definition that is required to complete the problem statement. 

3.1.1 Section Summary 

Throughout the rest of this chapter and the next three, we will summarize each 
section by showing how the processes described in it change the statement of the 
simplified FAMILIES problem shown in Figure 3.1. 

P : family-member , Q : family-member, 

R : family-member, S : family-member 

grandchild(Q , 5) 

\/x child(P,x) <$■ x = R 

married(Q, P) 

Query: find-all x \ parent(S,x) 

Figure 3.1: A small problem about families. 

The process of acquiring missing definitional knowledge expands this problem state- 
ment to the one shown in Figure 3.2. 

In addition, child and married have been identified as primitives. Note that parent 
could have been identified as primitive instead of child. However, both of these can 
not be primitive because of the biconditional between child and parent. Note also 
that the choice of one of these as primitive does not affect the representation that 
the system designs. 
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P : family-member, Q : family-member, 

R : family -member, S : family-member 

grandchild{Q, S) 

Vx child(P, x) «=>■ x = R 

married(Q , P) 

yxiy[grandchild(x,y) <^ 3z{child(x,z) A child{z,y))} 
VxVy[child(x,y) <^ parent( y, x )] 
Query: find- all x \ parent(S,x) 

Figure 3.2: The example problem with definitions added. 

3.2 Eliminating Irrelevant Information 

It is desirable to eliminate irrelevant information before a representation is designed 
because we do not want representations to capture irrelevant constraints. Part of 
my approach is to design representations for problems before solving them. 3 In fight 
of this, I have developed an irrelevance filter that does not require a problem to be 
solved to use. However, it does not always eliminate all irrelevance. 

The filter is guaranteed not to eliminate relevant information as long as it is run 
on a complete problem statement. However, the system runs the filter right after 
primitive concepts are identified. Subsequent steps in representation design may 
further complete a problem statement and, therefore, running the filter at this point 
may cause relevant information to be eliminated. Therefore, the system allows for 
the possibility that information filtered out of a problem statement may later be 
"resurrected." 

The discussion that follows begins by defining the class of irrelevance that is usefully 
eliminated in representation design. The filter is described and proved sound. Also 
it is shown to be incomplete. 

3.2.1 Strong and Weak Irrelevance 

We define two classes of irrelevance: strong and weak. A fact is strongly irrelevant to 
a problem P if it can not be used to derive the answer for any problem in P's class. 

3 An alternative approach is to solve a problem and then design a representation with the benefit 
ofhind sight. This representation can be used to solve other similar problems more efficiently. 
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A fact is weakly irrelevant to a problem P if there is at least one problem in P's class 
whose answer can be derived without the fact. We will also say that a fact that is not 
strongly irrelevant is weakly relevant because it is relevant to at least one problem in 
the class. 

Since representations are designed for problem classes, weak irrelevance should not 
be eliminated and, therefore, our interest is in identifying strong irrelevance. 

A direct method for identifying strong irrelevance would be to generate all the prob- 
lems in the input problem*s class, solve them, and then identify statements that were 
not used in any of these proofs. Clearly this method is impractical. 

The method that the representation design system uses abstracts away from the 
original problem, retaining only its propositional structure. It then identifies a subset 
of the statements that can not appear in any proof in the problem class based on the 
connectivity of the problem's propositional approximation. 

I will show that the facts that can not appear in any propositional proof are strongly 
irrelevant. I will also show a counter example to the converse, illustrating that the 
method does not eliminate all strong irrelevance. 

3.2.2 An Approximate Strong Irrelevance Filter 

The filter begins by constructing a propositional version of the problem by replacing 
the atomic formulas in each problem statement by propositional symbols, retaining 
the propositional structure of each statement. Then this version of the problem is 
converted to clause form. Next, the query fact is extracted from the problem query, 
converted to propositional clause form, and added to the clause set. Finally, a con- 
nection $rap/i[Kowalski75J is used to determine which clauses are strongly irrelevant. 

The nodes of the connection graph correspond to the propositional clauses and are 
linked together when they contain literals that resolve. 4 Any clause that contains a 
literal that is not connected to any other clause in the graph is strongly irrelevant. 

Disconnected clauses are called impure in the connection graph literature and elim- 



4 In a connection graph built from a set of first order clauses, the links between nodes are labeled 
with the most general unifier of the two literals that resolve. 
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ination of impure clauses is a standard connection graph technique for eliminating 
irrelevance. However, using a prepositional connection graph to eliminate irrelevance 
is non-standard and results from our interest in strong irrelevance. Recall that dif- 
ferent problems in the same class can have different equality relationships between 
terms of a sort. The irrelevance elimination procedure must, therefore, allow for 
equality between any terms. This fact requires us to modify the standard irrelevance 
elimination procedure. One way to do this is to add axioms for each predicate symbol 
in a problem so that, given any true fact, we can derive an equivalent true fact by 
substituting equals for equals. It turns out that once such axioms are included, the 
same clauses are impure in the first order connection graph as in the prepositional 
connection graph created from the same problem statement (minus the equality ax- 
ioms). 

Having identified the strongly irrelevant clauses, strong irrelevance is eliminated from 
the original (non-clausal) problem statement by identifying the statements from which 
those clauses were derived and simplifying those statements (or in some cases remov- 
ing them). This problem statement is what is used in the representation design 
process. 



Deriving the Propositional Problem 

The first step is to extract the propositional structure of the problem. This is done 
by substituting a predicate's name — as a propositional symbol — for any atomic 
formula referencing it. For example, P(x) becomes P; ^P(x) becomes -P. In 
doing this, equalities are replaced by the predicate symbol EQUAL and membership 
expressions are replaced by the symbol MEMBER. As this is done, a record is kept 
of the original statement each propositional statement is derived from. Also, sort 
statements are removed in this step. 

Figure 3.3 gives an example that we will use to illustrate the methods explained in 
this section. This is the small FAMILIES problem with statements added connecting 
married and child. The result of transforming this problem as described so far is 
shown in Figure 3.4. Note that the query has temporarily been removed. 

The next step in this process is to extract a fact from the problem query, convert 
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P : family- member, Q : family-member, 

R : family-member, S : family-member 

grandchild(Q, S) 

\/x child(P,x) <£> x = R 

married(Q, P) 

VxiyMc[child{x, c) A child(y, c) A x ^ y => married(x,y)} 

Vxiy\/c[married(x,y) A child(x,c) => child{y,c)\ 

VxVy[grandchild(x,y) <£> 3z(child{x,z) A child(z,y))} 

\/xiy[child(x,y) & parent(y,x)} 

Query: find-all x \ parent(S,x) 

Figure 3.3: The example problem with the relationship between child and married 
added. 

grandchild 

child & EQUAL 

married 

child A child A -> EQUAL ==• married 

married A child => child 

grandchild ^ child A child 

child &■ parent 

Figure 3.4: The prepositional form of the FAMILIES problem with acquired defini- 
tions added. 

it to propositional form, and add it to the propositional problem statement. The 
procedures for extracting a fact from each of the query types are as follows: 

1. For queries of the form D <j>, the fact extracted is -.^. This is exactly what is 
standardly done in building a connection graph: the statement to be proved is 
negated. 

2. To understand how a fact is extracted from a query of the form O <f>, recall that 
these queries are answered by adding 4> to a problem situation and looking for 
a contradiction. This is equivalent to trying to prove U-«j> and reporting that 
(p is possible unless the proof succeeds. Therefore, 4> is the fact extracted from 

O (p. 

3. For queries of the form find-all x | <f>, the fact extracted is also -^ because in 
order to answer queries of this type the system must construct proofs of <f> for 
all individuals that it can. 
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4. The system can only answer find-the queries when the functions mentioned in 
them are defined in terms of relations. For example, the query, "find the mother 
of A"' can be answered because mother-of is defined in terms of the relations 
parent and female. Expanding a find-the query into a fact containing only 
relations is necessary because the irrelevance filter works with the propositional 
structure of a problem which does not include functions. The system derives a 
fact for a query of the form find-the r by using definitions to expand the formula 
3X x = t into one that mentions only relations. For example, the system uses 
the definition of mother-of to expand the formula x = mother-of (A) into 

3x parent(x, A) A female(x). 
This is the fact extracted from the query, "find-the mother-of (A). 

Extracting the query fact from the problem in Figure 3.4 adds -parent to the propo- 
sitional problem statement. 

The next step is to convert the propositional problem statement to clause form, still 
keeping track of which original problem statement each clause came from. 

Before the graph is built, the literals EQUAL and ^EQUAL are deleted from the 
clauses. This has the effect of assuming that no clause should be considered impure 
on the basis of an equality or disequality; this in turn is equivalent to assuming that 
(dis)equalities are always relevant. This is exactly what we want when reasoning 
with respect to a problem class because it is usually possible to construct another 
problem in this one's class in which a (dis)equality is relevant, making EQUAL and 
^EQUAL clauses weakly relevant. 

Building the Connection Graph 

This process begins by dividing the clauses into two sets. The query clauses are those 
derived from the query fact, the others are called the descriptive clauses. 

When the graph is complete, all descriptive clauses that do not appear in the graph 
are marked strongly irrelevant. Then the graph is checked for impure clauses. These 
are marked strongly irrelevant and deleted from the graph. These deletions may 
cause other nodes to become impure, in which case these are marked and deleted. 
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This process continues until no more impure clauses can be found. 5 

As an example, consider Figure 3.5. This is the problem of Figure 3.3 with the 

following strongly irrelevant statements added: 

Vx[child(P,x) => hair color (x, RED)} 
\/x[haircolor(x, RED) => pigeon-toed(x)} 



P : family-member, Q : family-member, 

R : family-member, S : f amily -member 

grandchild(Q, S) 

Vx child(P,x) <^> x = R 

married(Q, P) 

"ix\/y\/c[child{x,c) A child(y,c) A x ^ y => married(x , y)] 

\/xVyVc[married(x ,y) A child(x,c) => child(y,c)} 

\/xiy[grandchild{x,y) <^ 3z(child(x,z) A child(z,y))) 

\/xiy[child(x,y) <£> parent(y,x)] \/x[child{P,x) => hair col or (x, RED)} 

\/x[hair color (x, RED) =$■ pigeon-toed(x)} 

Query: find-any x \ parent(S,x) 

Figure 3.5: A problem used to illustrate the irrelevance filter. 



The propositional form of this problem is shown in Figure 3.6. The clausal proposi- 
tional form is shown in Figure 3.7, while Figure 3.8 shows the connection graph built 
from the clausal propositional form. The node enclosed in a rectangle in that figure is 
impure. This node corresponds to the propositional clause, ~^hair color \l pigeontoed, 
which, in turn, corresponds to the statement, 

\/x[haircolor(x, RE D) — - pigeontoed(x)}, 
in the original problem. 

This node is deleted from the graph and the above statement is deleted from the 
problem. In doing so, the node corresponding to the problem statement 

\/x[child(P,x) => hair color (x, RE D)} 
becomes impure. Again this node and corresponding statement are deleted. 



5 This is a standard process, see [Kowalski75>. 
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grandchild 

child <& EQUAL 

married 

child A child A -^EQUAL =? married 

married A child => child 

grandchild ^ child A child 

child <^ parent 

child => hair col or 

haircolor =±> pigeon-toed 

-^parent 

Figure 3.6: The propositional form of the example problem 

grandchild 
-ichild, child 
married 
■^child V married 

-^married V -^child V child 

-^grandchild V child, -^child V grandchild 

child <& parent -<child V parent, -^parent V child 

-<child V haircolor 

^haircolor V pigeontoed 

^parent 

Figure 3.7: The clausal propositional form of the example problem 
Properties of the Irrelevance Filter 

We begin by showing that the irrelevance filter is sound, i.e., that it only removes 
strong irrelevance. It is widely known that impure clauses in the first order connection 
graph for a problem can not be used in resolution proofs because they can not be used 
to derive the empty clause. Thus, since any problem is an instance of its own problem 
class and since impure clauses are irrelevant to the given problem, impure clauses are 
weakly irrelevant. A clause that is impure in the propositional connection graph for 
a problem must also be impure in the first order connection graph. Therefore, any 
impure clause in the propositional graph is at least weakly irrelevant. 

The definition of problem class disallows different instances that differ in the proposi- 
tional symbols they use or in the sense in which they appear. To see this consider the 
following example. If the clausal form of a problem does not contain the literal -/>, 
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child V haircolor 



parent V child 



child 



parent V -< child 



' child 



haircolor V pidgeontoed 



child V -i grandchild 



child V grandchild 



child V married 



parent 



child V -^married V -> c^w'/d 
I— I 



Figure 3.8: The connection graph for the example problem with irrelevance added. 



then the clausal form of the problem class also does not. Therefore, no problem in 
the class contains ->P in its clausal form. Since the connections in the propositional 
graph for a problem are between propositional symbols, if a clause is impure in the 
propositional connection graph for one problem it will be impure in the graph for 
every problem in the class. Thus, that clause is weakly irrelevant to every problem 
in the class, i.e., it is strongly irrelevant. Therefore, the filter eliminates only strong 
irrelevance. 

Note that eliminating impure clauses in the first order connection graph without 
including substitution axioms for every predicate in a problem can eliminate clauses 
that are not strongly irrelevant. For example, consider the following problem: 

P(a) 

Vz[R(g(x)) =•" Q(x)] 

Vz[P{x) => R(f(x))] 

Query: DQ(b). 
The clausal form of the second general statement is impure in the first order connec- 
tion graph. However, it is not strongly irrelevant because we can construct another 
problem in this one's class containing the statement f(a) = g(b). 

The irrelevance filter is not complete. In fact, making it so can be very difficult. For 
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example, if we add the statement 

VxVy[f(x) £ g{y)\ 
to the above problem, then 

Vx[P(x) =? R(f(x))} 
is strongly irrelevant. 

It seems reasonable for the filter to leave the above statement. However, consider 
a more blatant problem resulting from definitions. Suppose a problem coutains the 
definition 

yxVy\sibling{x,y) <^ 3p child(p,x) A child{p,y) A x ^ y\, 
mentions child elsewhere in the problem, but never mentions sibling. The prepo- 
sitional version of the problem will contain, 

sibling <e> child A child \ -EQUAL, 
and this will be included in the graph in spite of the fact that it is strongly irrele- 
vant. This is because sibling connects to itself through the definition. Concepts like 
sibling above that are irrelevant but are connected through a definition in the graph 
are called definitional ly irrelevant. Definitional irrelevance can be detected using the 
connection graph to identify concepts that are connected to themselves through a 
definition, but are otherwise disconnected. This is discussed below in section 3.2.4. 

Eliminating Strong Irrelevance 

Once a collection of strongly irrelevant clauses has been identified, the original prob- 
lem statement is simplified to remove all mention of them. The desired effect of this 
process should be the same as converting the problem to clause form and then re- 
moving the clauses that mention disconnected literals. However, getting the correct 
effect on a collection of non-clausal statements is more complicated. For example, 
suppose the literal P is disconnected and consider the difference between what should 
be done to the two statements, 

PV Q ^ R 
PAQ=> R 

The first statement should be simplified to Q =» R because either P or Q implies R. 

However, the second statement can be entirely thrown away because R*s truth value 
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must depend on P. 

The insight used in the elimination procedure is that when some literal is discon- 
nected, the solution to the problem will not depend on the truth value of the literal. 
Therefore, whether it is true or false, the problem solution will be the same. This in- 
sight translates rather directly to the following manipulation of a problem statement 
to eliminate irrelevance. Replace each statement S in the problem that contains a 
disconnected literal £ with, 

S[C/true\ V S[C/ false]. 6 

The new statement is then simplified according to the rules in Figure 3.9. 

-true ^> false 
-ifalse ^> true 
true AP » P 
false AP > false 
true V P 3> true 
false V P » P 
true =» P > P 
P => true 3> true 
false => P ^> true 
P => false > -P 
P & true > P 
P & false > -.P 

Figure 3.9: Simplification Rules ( » means "simplifies to"). 

Continuing the example above, if P is irrelevant, then, 

PV Q=> R 
becomes, 

[false V Q =• R\ V [true V<?=? PL 

and then simplifies: 
K> => R\ V /a/ae, 

<?^>P.' 

Similarly, 

PAQ^- P 

is transformed as follows. 



"The notation S[C;true: means the statement obtained by replacing C by true in S. 
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[false A Q => R] V [true A => R], 
true V [<? => flj, 

£rue. 
The second statement simplifying to true means that when P is irrelevant, the state- 
ment no longer adds a constraint to the problem. 



3.2.3 Using the Connection Graph To Identify Definitions 

As advertised, the propositional connection graph is used as part of a procedure 
that finds concepts with equivalent necessary and sufficient conditions. We will call 
the statements that constitute such conditions the definition of the concept. The 
technique uses the graph to identify a concept and a set of statements in the first 
order problem that might constitute a definition. Once identified the first order 
statements are checked to determine if they are, in fact, equivalent necessary and 
sufficient conditions for the concept. 

The analysis of the connection graph finds literals that are connected to themselves 
through general statements in the first order problem. The technique relies on the 
following ideas. A connection graph is pure when it contains no impure nodes. The 
set of clauses in which a literal and its negation appear is called the literal's reference 
set. The technique also makes use of a further subdivision of clauses into general and 
individual clauses. General clauses do not mention individuals, individual clauses do. 

After the graph is built, the general clauses of each literal's reference set are inspected. 

If they form a pure graph, the first order statements associated with those clauses 

may constitute a definition of the concept. For instance, consider the propositional 

clausal form of the definition of sibling: 

-child V EQUAL V sibling 
-sibling V child 
-sibling V -EQUAL. 

This set of clauses is a pure graph (see Figure 3.10). 

When the general clauses in a literal's reference set satisfy this condition, the system 
checks the original problem statements from which the clauses were derived to see 
if they are, in fact, a definition. To see that checking the associated first order 
statements is necessary consider a simple example that satisfies the above condition 
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siblings child ^siblingV ^equal 



-i child V equal V sibling 
Figure 3.10: Connection graph for the definition of sibling. 

but which is not a definition. Suppose a concept A has the following general clauses 

in its reference set: 

A(x,y) \/ iB{x,y) 
^A(x,y)y B(y,x). 
These constitute a pure graph, but they are not a definition of A since B(x,y) =$■ 

B(y,x) follows from these clauses. 

A set of statements constitutes a definition for a concept C if, given a statement 
not involving C, the only new unconditional facts that can be deduced from those 
statements involve C. For example, from the definition of sibling and statements 
involving child and EQUAL, we can deduce only unconditional statements involving 
sibling. For instance, given -child(A, B) and the definition of sibling we can deduce 
only ^sibling(A, B). However, the statements 

A(x,y)V^B(x,y) 
^A{x,y)VB{y,x) 
are not a definition of A because from them we can derive B(x,y) =*> B(y,x). 

3.2.4 Removing Definitional Irrelevance 

The irrelevance filter discussed above is a procedure that removes irrelevance based 
on strongly irrelevant clauses in the propositional connection graph. This section dis- 
cusses using the propositional connection graph to identify strong irrelevance missed 
by the filter. As discussed in section 3.2.2, a definitionally irrelevant concept can 
get connected to itself in a graph through its definition. A concept is definitionally 
irrelevant when it is defined and mentioned only in its definition. 

To detect definitional irrelevance, the system first uses the procedure described in the 
last section to find defined concepts. It then checks the reference set of each defined 
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concept. If it contains only clauses that are associated with the statements in the 
concept's definition, the concept is definitionally irrelevant. 



3.2.5 Eliminating Irrelevance in Incomplete Problem State- 
ments 

As mentioned, there is a trade off between running the filter early, eliminating relevant 
concepts, and running it later only to find that representations have been designed for 
irrelevant concepts. Since the cost associated with "resurrecting"' relevant concepts 
eliminated prematurely is much less than designing unnecessary representations, it is 
best to run the filter early and keep track of what gets eliminated to allow resurrection. 

After running the filter, the user is asked about another type of potentially missing 

information: connections between primitive concepts. The user has already been 

asked for a definition for every primitive concept. If none was given or if the definition 

given did not connect the concept, it may be that there is a missing necessary or 

sufficient condition for the concept that will. To address this possibility, the following 

heuristic is used: 

"When a literal is determined to be disconnected, ask the user if he can supply 
necessary or sufficient conditions for it." 

This is a heuristic in the sense that it can never be guaranteed to ask all potentially 
useful questions about the connections between concepts. Any concept could always 
connect to a non-primitive concept. The only way to augment this acquisition heuris- 
tic so that it will ask for all possible connections is to ask about possible necessary 
or sufficient conditions connecting every pair of concepts in the problem. This is 
impractical. 

When a new statement is provided at this point, it is converted to propositional 
clause form and added to the connection graph. This may cause statements that 
were earlier judged to be irrelevant to be resurrected. Also if any totally new con- 
cepts are mentioned in a new statement, the representation design system returns 
to the definition acquisition phase for those concepts and adds any new definitional 
information into the graph. 
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3.2.6 Section Summary 

Figure 3.11 shows the state of our example problem after definitional knowledge is 
added. When the irrelevance filter is run on this problem, married is identified as 
irrelevant. As a result, the system asks the user to supply necessary or sufficient con- 
ditions connecting married and child (the other primitive concept in the problem). 
The user presumably responds with the two statements: 

yxiy s ic[child{x,c) A child(y, c) A x ^ y => married{x,y)\ 
yxVy\/c[married(x,y) A child(x,c) =$■ child(y,c)\. 

In this case, running the irrelevance filter does not change the set of represented 

concepts identified in the last step. It remains as {child, married}. 

P : family-member, Q : family-member, 

R : f amily- member , S : family -member 

grandchild(Q,S) 

\fx child(P,x) ■=> x = R 

married{Q,P) 

\/xVy{grandchild(x,y) O 3z(child(x,z) A child(z,y))) 

\fx\/y[child(x,y) <£> parent(y,x)} 

Query: find-all x j parent(S.x) 

Figure 3.11: The example problem with definitions added. 



3.3 Including Concepts With Restrictions 

The system now adds to the set of represented concepts those that have explicit re- 
strictions on them. It looks for mixed statements that it recognizes as restrictions and 
adds the concepts in those statements. It recognizes four forms of mixed statements 
as restrictions: 

1. A statement that is an atomic formula with a negation at the top level, e.g., 
\/x~ibrother(M, x). 

2. A conditional statement containing one or more equalities between universally 
quantified variables Nan constants^ for example, \/x[child(P, x) ■■«• x — R). 
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This procedure will not identify concepts that have implicit mixed constraints on 
them. For example, it will not identify the restriction on brother implied by the 
statements 

\/x~sibling(M, x) 

VxVy[brother(x,y) •£> sibling (x,y) A male(y)}, 
i.e., Vx^brother(M,x). 

3.4 Completing a Problem's Sort Signature 

The steps described in the last few sections identify the set of represented concepts for 
a problem. The system defines a representation for each represented concept directly 
from its sort signature. For example, given that married is a represented concept, 
the system includes 

married: relation (family-member, family-member) 
in the initial representation description. 

Generally, analytical reasoning problems contain only partial sort information which 
the system completes using a few simple techniques. In cases where these techniques 
do not suffice in determining needed sort information, the system asks the user. 

A few restrictions are placed on problem statements to make it possible to extract 
sort information. An initial problem statement must make reference to one or more 
domain sorts that are assumed to be sorts of domain individuals. For example, there 
is one domain sort in FAMILIES which is referred to as family -member. All domain 
sorts are assumed to be disjoint from all others. 

All constants in the problem statement must have their sorts declared. For ex- 
ample, note that in the FAMILIES problem there is a sort declaration (e.g., N : 
family-member) for every individual mentioned. 

To describe the fact that family-member is a domain sort, the system declares it as 
follows: 

(deftype family-member specializes individual disjoint) 
This statement means that family members are a subtype of the individuals in the 
"world." 
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Given sorts for all the individuals, the system attempts to determine the domain 
of relations and the domain and range of functions by inspecting the statements in 
which they appear. 

It attempts to determine the domain of a relation in two ways. When there is a 

problem statement that relates individuals, the domain of the relation involved is 

defined from the sorts of the individuals. For example, from the statements 

A : family-member 

B : f amily-member 

married(A, B) 
it is determined that married is a binary relation on family members. 

When the domain of a relation can not be determined directly, the system looks for 
statements that use variables in the argument position in question. It then tries to 
determine the sort of that variable by its other uses in the statement. For example, 
given, 

sibling: relation (family-member, family-member), 
and the statement, 

\/xyy[brother(x,y) <=> sibling(x.y) A male(y)}, 
the representation design system can determine that brother is a binary relation 
defined over family members and that male is a unary relation over family members. 

The system attempts to determine the sorts in the domain of a function in the same 

way as relations. For the range of a function, it looks for uses of the function as an 

argument in a relation whose domain is known. If this fails to produce a sort for the 

range, then the system looks for an equality between an application of the function 

and another term whose sort is known. For example, from the statements, 

A : f amily-member 

B : family-member 

father(B) = A, 
father is defined as 

father: function(f amily-member, family-member). 

Problem statements are also checked for consistent use of functions and relations. 
When the sort of a function or relation is ambiguous in the problem statement, the 
system asks the user to disambiguate. 



76 CHAPTER 3. DERIVING AN INITIAL REPRESENTATION 

3.4.1 Section Summary 

The primitive concepts of our example problem are married and child. Because it 

appears in the find-all query, parents is also included. The representations of these 

concepts are defined as follows: 

married: relation (family-member family-member) 
child: relation (family-member, family-member) 
parent: relation(f amily-member, family-member) . 

3.5 Chapter Summary 

This chapter has described how an initial representation is derived from a problem 
statement. The basic strategy is to identify a collection of concepts that is sufficient 
for representing the problem and then to define representations for them. It is desir- 
able to identify the smallest collection of concepts that is sufficient for this purpose 
to avoid designing redundant representation machinery. The smallest collection of 
concepts that is sufficient is the collection of relevant primitive concepts. Although 
not logically necessary, the system also includes in this set defined concepts that have 
relevant restrictions on them and defined concepts appearing in find-all queries. 

The relevant primitive concepts are identified in three steps: 

1. The primitive concepts are identified and the user is given the opportunity to 
give definitions for any of these that are not, in fact, primitive. 

2. Irrelevant concepts are eliminated from the problem. A key idea in understand- 
ing irrelevance is the notion of strong irrelevance: a fact is strongly irrelevant 
to a problem class when it can not be used in solving any problem in the 
class. Since representations are designed for problem classes this is the kind of 
irrelevance we attempt to eliminate. 

3. Concepts that have explicit restrictions on them are identified. Some concepts 
that have implicit restrictions on them (i.e., those for which a restriction follows 
from the problem but is not stated) are identified later in representation design. 
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4. Concepts appearing in find-all queries are added to the set of represented con- 
cepts. 

The final step in deriving the description is to define representations for the concepts 
identified by the previous steps. The initial representations for the concepts identified 
in the steps above are isomorphic to the sort signature of those concepts. Analytical 
reasoning problem do not include complete sort information, but usually sort signa- 
tures can be derived from the information given by a few simple techniques. When 
this is not possible, the system asks the user for the missing sort information. 

The resultant representation is complete but does not capture any of the constraints 
of the problem statement. This means that the representation can be used to build 
problem situations in which, for example, married, child, and parent are arbitrary 
relations over family -member. For instance, we can create a situation in which A 
is the child of B but B is not the parent of A. The rest of the processes of repre- 
sentation design attempt to maximally constrain this representation while preserving 
completeness. 



Chapter 4 



Classification 



We have defined a representation to be a mapping from concepts to structures that 
enforce constraints. The representation design system captures constraints on a con- 
cept by representing it in terms of a library structure enforcing those constraints. 
Part of the system's library of structures is shown in Figure 4.1. Concepts are repre- 
sented by instances of library structures. For example, married is represented as an 
instance of relation: 

married: relation (family-member, family-member). 
The system tries to capture as many of a concept's constraints as possible by rep- 
resenting it with the most specialized structure (or structures) available, i.e., the 
available structures that together enforce the most constraints on the concept with- 
out enforcing constraints that are not true of the concept. 

The most specialized structures for representing a concept are identified by classifying 
the concept in a hierarchy of the library structures (again see Figure 4.1). The 
definitions of the property names appearing in Figure 4.1 are given in Figure 4.2. If we 
ignore the node labels, the figure depicts a standard hierarchy: The nodes represent 
classes of concepts, with more specialized classes (containing concepts with more 
properties) appearing below less specialized classes. For example, the node appearing 
below bin-rel on the fink labeled "symmetric" denotes the class of symmetric binary 
relations, a specialization of the class of binary relations. 

Some nodes in the hierarchy have several unlabeled links coming into them. These 
denote specialized classes of concepts that are the intersections of the classes linked 
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Figure 4.1: Part of the structure library. 
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Property 


Definition 


reflexive 


Vx R(x,x) 


irreflexive 


\/x -<R(x, x) 


symmetric 


VxVy[R(x,y)=S>R{y,x)] 


antisymmetric 


\/x\/y[R(x, y)Ax^y=> ^R{y, x)\ 


transitive 


VxVyVz[R(x,y)AR{y,z) =*► R(x,z)} 


antitrans 


VxVyVz[R(x,y)AR(y,z)=>- ^R{x,z)} 


total 


Vx3y y = f(x) 


1-1 


\fxVy{f(x) = f(y) ^ x = y] 



Figure 4.2: The definitions of the properties labeling links in Figure 4.1. 
(Properties not defined here are determined by inspection.) 



from above. 

Nodes are labeled with library structures. The structure labeling a node captures the 
properties of the node's class. Placing a concept in a class labeled by a structure is 
how the system identifies that structure for representing the concept. For example, 
the node labeled sym-rel denotes the class of symmetric binary relations; the library 
structure sym-rel enforces symmetry. By classifying a relation as symmetric, the 
system identifies a structure for representing that relation which captures symmetry. 

Nodes are included in the hierarchy for one of three reasons. We have already ex- 
plained the first reason: a node is included for a class when the system knows a 
specialized structure for representing concepts in the class. Representations con- 
taining specialized structures are better because fewer situations can be expressed 
in them. Therefore, a problem solver using a specialized representation explores a 
smaller search space. For example, when the concept F is represented as a function, 
problem situations can be expressed that have multiple domain elements mapping to 
the same range element. When F is represented as a 1-1 function, it is more con- 
strained because the legal situations in the new representation are a subset of those 
in the previous representation, i.e., those situations in which domain elements map 
to unique range elements. 

The second reason that nodes are included in the hierarchy is that some structures 
are included in the library because they capture a combination of constraints. This 
applies to nodes denoting the intersection of classes. Structures capturing a combina- 
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tion of constraints are included in the library only when they exploit the combination 
to gain efficiency. For instance, the class denoting equivalence relations is included 
because there is a library structure eq-rel which captures the combination of reflexiv- 
ity, symmetry, and transitivity. Eq-rel is included in the library because it captures 
these constraints more efficiently than the combination of ref-rel, sym-rel, and 
trans-rel. A more in depth argument for when to include structures in the library 
is presented in section 4.2. 

The third reason that nodes are included in the hierarchy has to do with concept 
introduction: An unlabeled node is only included when the system has rules for 
introducing a new concept allowing it to represent an existing concept differently. 
This is discussed in Chapter 5. 

The system also designs representations for sorts. While concepts are represented as 
instances, sorts are represented as data types which are defined as subtypes of library 
structures. For example, the representation of family-member is defined as 

(deftype family-member specializes individual disjoint), 
where individual is the library structure for representing domain individu- 
als. Just as the sort family-member is a subsort of the problem domain, the 
type family-member is defined to be a subtype of individual. Instances of 
family-member are used to represent individuals like N from the FAMILIES problem. 

The system treats sorts in a special way, classifying them in a separate hierarchy 
(shown in Figure 4.3). Classifying sorts increases the kinds of constraints that can 
be captured by classification. The structures in the sort hierarchy enforce constraints 
common to all individuals of a sort as well as constraints between individuals of a 
sort. Axiom schemas giving the meaning of each of the constraint names used in 
Figure 4.3 are given in Figure 4.4 

One way to view classification is as a process of identifying useful constraints in a 
problem. This gives the system a way of directing a search for interesting constraints, 
saying that they- are those that library structures can capture. Without such a 
technique the representation design process is faced with an unstructured collection 
of statements, not knowing which are important for design or which it should try to 
capture first. 
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Figure 4.3: The hierarchy used to classify sorts. 



minimum size 


Vz : 53yi, . . . , y n [{y u ...,y n }Qx] 


limited size 


\tx: S3y 1 ,...,y n [x C {y l} . . . ,y n }} 


empty 


3x : S[x = 0] 


disjoint 


\/x : 5Vy : SVz[z ExAzEy=>x — y] 



(The symbol S in these schemas is the sort being classified.) 

Figure 4.4: Axiom schemas defining the meaning of the constraint names used in the 
sort hierarchy. 
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Classification can be used to assist in the acquisition of missing information. The 
constraints that it asks about are assumed to be important enough in representation 
design and applicable to a sufficiently wide variety of problems that they are worth 
asking about when they are not found in a problem statement. This assumption is 
discussed below in section 4.2. 

Classification can not capture all possible kinds of constraints. Intuitively, we can 
divide possible constraints into those that constrain a concept in terms of itself and 
those that constrain a concept in terms of other concepts. An example of the first 
type is, 

\/xVy\married{x,y) => married(y,x)} 
and an example of the second type is, 

VxVyVz[y € parents(x) A z E parents(x) A y ^ z =» married(y, z)}, 
i.e., "two different parents of an individual are married." 

Classification can only capture constraints of the first kind because it classifies con- 
cepts individually. Similarly, sort classification can only capture constraints between 
individuals in a sort. For example, one sort introduced during design of a representa- 
tion for FAMILIES is parent-set. Classification can capture the constraint, "parent 
sets are disjoint from each other," which is a constraint between individuals in the 
sort. It can not capture a constraint like, "every parent set is a couple,"' which is a 
constraint between sorts. 1 

Operationalization can capture both types of constraints. This is one reason it is 
done after classification. 



4.1 The Classification Process 

Before classification begins, all problem statements are converted to implication nor- 
mal form. This is the following 



Note, however, the system captures taxonomic relationships between sorts as it builds up a sort 
structure during design. This is described i-n section 5.1.1. 
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<f>l => ■■•(4>n => (Vl A ••■ /V0 m )), 

where each of the <pi and i\ are literals. Note that this is equivalent to 

(0! A--- Ad„) =±- (Vi A •■■Atb m ) 
and we shall often use this latter form in our examples even though the system 
is actually using the former. 

The main loop in classification systematically selects concepts included in the initial 
representation and tries to identify properties in an effort to "push" down into the 
concept hierarchy. It uses a concept's definition to determine where to begin. For 
example, classification of married begins at the relation node. 

When classification reaches a node in the hierarchy, the system looks at the properties 
labeling links leaving that node and tries to determine which properties the concept 
has. For example, the classification of married begins with the system looking at the 
links leaving the relation node, trying to determine if married is unary, binary, or 
n-ary. 

The hierarchy is also annotated to show which properties leaving a node are disjoint 
(indicated by the arcs connecting links in Figure 4.1 and Figure 4.3). For example, 
unary is annotated as disjoint from binary. The system tries to identify one property 
from each group of disjoint properties leaving a node. For example, when classifying 
a relation at the bin-rel node, the system tries to determine whether the relation is 
transitive or antitransitive, reflexive or irreflexive, etc. At the same time, the system 
tries to find counter examples to properties labeling each of the links. 

When the system is unable to identify one property from a disjoint group or to 
demonstrate a counter example for each property in the group, it asks the user about 
the properties. For example, if the system is unable to determine whether a binary 
relation is transitive, antitransitive, or to find counter examples for both, it asks the 
user about each property in turn. 

Classification at a node is complete when the system has examined each disjoint 
group of properties at that node and either it has identified one property from the 
group or it has determined that the concept being classified has none of the prop- 
erties in that group. Classification then continues with each of the nodes below the 
properties identified. For example, after married is classified as binary, classification 
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continues with the node labeled bin-rel; when the system determines that married 
is irreflexive, symmetric, and antitransitive, classification completes at this node. 

Classification handles unlabeled links differently. When classification completes at a 
node with unlabeled links leaving it, those links are marked traversable. Classification 
does not proceed down an unlabeled link until all other links entering the node 
below are marked traversable. For example, classification will only proceed from the 
node labeled sym-rel to eq-rel when the links leaving the nodes labeled sym-rel, 
ref-rel, and trans-rel have all been marked traversable. 

When classification of a concept reaches a node that has a library structure associ- 
ated with it, the representation of the concept is redefined in terms of that library 
structure. For example, when married is classified as irreflexive, its representation is 
redefined as 

married: irr-rel(family-raember, family -member). 

A concept can be placed in multiple classes, each of which may be labeled. When this 
occurs the system defines a new data type to represent the concept. The new data 
type is a mixture of the data types labeling each node. For example, if a relation is 
classified as symmetric and reflexive but is neither transitive nor antitransitive, the 
system defines a new data type to represent that relation. The new data type is a 
mixture of ref-rel and sym-rel. All of the library structures have been defined so 
that they are mixable in this fashion. This is standard practice in object oriented 
programming so the details are omitted here. We will use the convention of naming 
data types that are mixtures by appending the names of the component data types. 
For example, the data type that is a mixture of ref-rel and sym-rel will be named 
ref-sym-rel (which is equivalent to sym-ref -rel). 

Sorts are classified by the algorithm described above, operating instead in the sort 
hierarchy. The sorts that a concept is defined over are classified before that con- 
cept is classified. The reason for this has to do with concept introduction. As will 
be explained in Chapter 5, new sorts can get introduced during the classification of 
existing sorts. Often when a new sort is introduced, the concepts defined over an 
existing sort get replaced by new concepts. In this case, any effort expended clas- 
sifying the existing concepts is wasted. The converse situation does not occur, i.e., 
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the classification of a concept may introduce new concepts but never introduces sorts 
that replace existing sorts. Therefore, the system waits to classify a concept until 
after its sorts have been classified. 



4.1.1 Answering Questions Posed by Classification 

A "brute force" approach to answering questions posed by classification would be to 
construct a statement of the property in question and then to use a theorem prover 
to try to show that the statement follows from the problem statement. For example, 
to determine whether married is symmetric, we could construct the statement, 

\/jc\/y{married(x,y) =■ married(y,x)}, 
and then instruct a theorem prover to determine whether this statement follows from 
the problem. 

However, recall that analytical reasoning problems are usually missing information. 
This means that a problem statement may not contain information about a property, 
either because it is not relevant to solving the problem or because the information 
is simply missing. This is problematic for the brute force approach. If we turn a 
complete theorem prover loose on some proof attempt, it may never halt and we can 
never know whether this is because we have not given it enough time or the problem 
statement is incomplete. 

The representation design system constructs a statement of the property and then 
uses a semi-decision procedure to try to prove that it follows. The semi-decision pro- 
cedure is implemented as a combination of three mechanisms: a rewrite system that 
simplifies problem statements, a matcher that recognizes some syntactic variations in 
the patterns it is matching, and mechanism that attempts to find a counter example 
to a general property that the system is looking for. When this procedure halts and 
reports a failure to determine the status of some property, the system assumes that 
the information is missing from the problem and asks the user about it. 

The advantage of this approach is that it avoids asking the user most questions that 
would be considered obvious, and at the same time avoids getting bogged down trying 
to prove that some constraint holds when a problem statement may not contain the 
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information. 

The approach is implemented as follows. First, as problem statements are added to 
the system's description, they are simplified by the rewrite system. The intention of 
this step is to partially "canonicalize" statements, i.e., rewrite them so they will be 
recognized by classification. 

The rewrite system is implemented in the usual way, as a collection of rules that match 
and replace patterns in statements (see appendix A for a more in depth discussion of 
the rewrite system). The current body of rules mainly exploits properties of sets to 
simplify statements. One of the rules is paraphrased as follows: 

If a statement is a conjunction of expressions, two of which are of the form 
x e S and y G S and a third expression in the conjunction has the form x £ y, 
then replace the first two expressions by {x,y} C S. 

This rule will, for example, rewrite the statement 

'\/x3y3z[y e parents{x) A:6 parents(x) A y ^ z] 
as 

\/x3y3z[{y, z} C parents(x) A y ^ z\. 

Both statements express the fact that every individual has at least two parents, but 
classification will recognize only the second statement as the "min size" constraint 
(Figure 4.4) on members of the sort parent-set (sets of parents of the same individual). 

To check the problem for a property, the system generates a statement expressing that 
property from the constraint axiom schema by substituting the appropriate concept 
for the concept symbol in the schema. All properties in the hierarchies have schemas 
associated with them (unless they can be checked by inspection). Figure 4.2 gives 
the schemas for properties in the concept hierarchy. 

The schemas for properties in the sort hierarchy are given in Figure 4.4. A first order 
statement is constructed from these schemas in two steps. First, the name of the 
sort being classified is substituted for the sort symbol in the schema. Second, when 
the sort was defined by the system as a sort for the range elements of a function, 
each sorted variable in the schema is replaced by an application of that function. For 
example, recall that the axiom schema for the minimum size constraint is 
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Vz :S3y 1 ,...,y n [{y 1 ,...,y n } C x\. 
The sort parent-set is defined by the system when the function parents is intro- 
duced. When classifying this sort, 5 is replaced by parent-set and all occurrences of 
x are replaced by parents(x), yielding 

Vz : parent-set3y 1 ,. . . , y n [{y u ...,y n }C parents! x)]. 

When this statement is matched against problem statements, y l: ...,y n will match 
any sequence of variables, allowing the system to check for a constant set of any size. 

The matcher allows statements that are syntactic variations of each other to match. 
One type of variation that it handles occur with commutative, associative connectives. 
For example, the matcher treats A and V as commutative, associative connectives so 
that two statements will match if changing the order of a group of conjuncts in one of 
the statements makes it unify with the other statement. For example, the following 
two statements match: 

VxVy[P{x,y) A (Q(x,y) A R(x,y))\ 
VuW[(R(u,v) A P(u,v)) A Q(u,v)}. 



4.1.2 An Example of Classification 

We demonstrate the classification of married as it is performed by the system when 
given the problem used in our running summary. The purpose of this example is to 
illustrate the classification process and the knowledge acquisition behavior that the 
system exhibits. The problem statement modified by processes of Chapter 3 is shown 
in Figure 4.5. Note that the system has converted these statements to implication 
normal form. However, in our examples, we will leave statements in their original 
form so long as their translation to implication normal form is straightforward. 

Before married can be classified, family -member must be classified in the sort hier- 
archy. Since family-member is represented as 

(deftype family-member specializes individual), 
classification begins with the individual node in Figure 4.3. The first question is 
whether the names mentioned in the problem statement refer to unique individuals. 
To determine this, the system looks for disequalities between all the individuals. The 
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P : family-member, Q : family-member, 
R : family-member, S : family-member 
grandchild(Q, S) 

Vx child(P, x) <=> x = R 

married(Q, P) 

\/x\/y[grandchild(x,y) <& 3z(child(x,z) A child{z,y))\ 

\/x\/y[child(x, y) ■£> parent(y, x)\ 

\/xiyVc[child(x, c) A child(y, c)Ai^t/^ married(x,y)\ 

VxiyVc[married{x,y) A child(x,c) => child(y,c)} 

Query: find- all x: parent(S,x) 

Figure 4.5: State of the example problem at the end of the Chapter 3. 



problem statement does not contain any, so the system asks the following question: 

Are P, Q, R, and S all different individuals 7 
Yes. 

All the analytical reasoning problems I have studied make the unstated assumption 

that individuals with different names are different. However, this could easily not be 

the case, so the system explicitly asks about this assumption. Since, in this case, the 

answer is "yes," family -member is specialized as 

(deftype family-member specializes unique-individual). 
The structure unique-individual represents a sort of individuals with the prop- 
erty that individuals with different names are different. This is implemented by an 
equality procedure associated with unique-individual that simply compares indi- 
vidual's names. 

The next question posed by classification is whether the problem statement mentions 

all the individuals of sort family-member. The system knows no way of checking 

the problem statement to see whether this is true, so it asks the question 

Are P, Q, R, and S all the family members? 
Don't Know. 

Note that, as in this example, the system allows the user to answer questions with 

"don't know" when determining the answer requires solving the problem, e.g., one 

can not tell whether or not all the family members are known until very late in the 

process of solving the FAMILIES problem. The system always treats such an answer 

as a "no" response. 
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Classification of family -member now terminates. The system proceeds with the 

classification of married. It is classified as binary by inspecting its representation. 

Next the system tries to determine whether it is reflexive or irreflexive. It looks for 

statements of these properties in the problem, and looks for counter examples. None 

are found so the user is queried: 

Is MARRIED reflexive? No. 
Is MARRIED irreflexive? Yes. 

In response to this answer, married is specialized as 

married: irr-rel (family-member, family-member) 
and the statement \/x^married{x,x) is added to the problem. 

Then the system turns to the question of whether married is symmetric or antisym- 
metric. Again the user is queried: 

Is MARRIED symmetric? Yes. 
In response to this answer, married is further specialized as 2 

married: sym-irr-rel (family-member, family-member) 
and the following statement is added to the problem: 

Vx\/y[married(x,y) => married(y,x)}. 

Classification now proceeds to the question of transitivity and, again, the problem 
statement provides no assistance, so the user is asked: 

Is MARRIED transitive? No. 
Is it antitransitive? Yes. 

Married is now specialized as 

married: ant itrans-sym-irr-rel (family-member, family-member) 
and the following statement is added to the problem, 

Vx\/y\/z[married(x,y) A married(y, z) => ~^married(x, z)]. 

This classification effort is now complete; it has reached the leaf node denoting ir- 
reflexive, symmetric, antitransitive relations (shown shaded in Figure 4.6). Since 
there is no structure associated with this node, the representation of married re- 
mains as shown above. However, as we will see in Chapter 5, this node is treated 



Recall that sym-irr-rel is actually admixture of the structures sym-rel and irr-rel. 
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specially by concept introduction. The problem statement resulting from this classi- 
fication is shown in Figure 4.7 with the added statements enclosed in a box. 



FUNCTIO 



REL. 



n-ary 




EQ-REU 



Figure 4.6: The node reached in classifying married. 

The specialized representation designed so far is: 

(deftype family-member specializes unique-individual) 
married: ant it rans-sym-irr-rel (family -member, family-member) 
child: relation (family-member, family-member) 
parent: relation(family-member , family -member). 

4.2 Assumptions About Library Structures 



For the library to be useful, the properties found in it should appear in a wide variety 
of problems and enforcing them as constraints should provide significant leverage in 
problem solving. In this case, classification becomes a technique for recognizing when 
-a problem contains properties that the representation design system knows efficient 
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P : family-member, Q : f amily-member , 
R : family-member, S : family-member 
grandchild(Q, S) 

\/x child(P,x) <=> x - R 

married(Q,P) 

\/xiy[grandchild(x,y) <£> 3z(child(x,z) A child(z,y))\ 

VxVy[child(x,y) O parent(y,x)] 

\Ixiyic{child(x, c) A child(y, c) A x ^ y =£> married(x.y)} 

\/x\/y\/c[married(x,y) A child(x,c) => child(y, c)} 



yx^married(x, x) 

Vx\/y[married(x,y) <^ married(y, x)} 
\fx\/yiz[married{x,y) A marriediy, z) => -imarried(x,z) ] > 



Query: find-all x: parent{S,x) 

Figure 4.7: The small FAMILIES problem after married has been classified. 

ways of capturing. It assumes that certain properties are worth looking for in a 
problem because of the leverage obtained in specialized representations that enforce 
those constraints. For example, it assumes that it is worth trying to determine when 
the sets of the same sort are disjoint from each other because the system has efficient 
ways of capturing disjointness. 

There is at least one rule of thumb in searching for structures that are widely applica- 
ble and provide significant leverage: the more general the properties that a structure 
captures the wider its applicability. However, it is usually the case that the more 
general a structure is the less leverage it provides in representation design. One of 
the challenges in finding the "right" collection of structures to look for is trading off 
generality against usefulness. 

In this research, useful structures were found by studying the representations that 
people use to solve analytical reasoning problems. I looked for structures that peo- 
ple commonly use in their representations and I identified the properties that those 
structures capture. The current library population is the result of this investiga- 
tion conducted with twenty analytical reasoning problems. This investigation was 
discussed in section 1.5.1. 

In retrospect, the structures that people use turn out to be constraints (or combi- 
nations .of constraints) having particularly efficients-implementations. Consider an 
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example of how much leverage the library structures afford because of this. One 
specialization of set in the sort hierarchy is disjoint-set. Set provides an equal- 
ity procedure for sets of the same sort which reports that two sets are equal if and 
only if all elements of both sets are known and they are the same. Disjoint-set 
provides a more efficient equality procedure that exploits the additional constraint 
that the sets of a sort are disjoint: it reports true if it can show that two sets share 
any members. Thus, it is more efficient in two ways: it does not necessarily have to 
check all members of the sets its comparing and it can work even if some members 
of those sets are unknown. 

It may be that the particular collection of structures in the current library prove to 
be less applicable as different problems are investigated. However, notice that the 
structures in the library are similar to concepts that have been important in math- 
ematics for a long time. I did not build the library by attempting to replicate what 
mathematicians consider important. Instead, I tried to capture what I found people 
using in representations. The fact that I ended up with a collection of structures sim- 
ilar to those that mathematicians consider important reinforces the likelihood that 
these have fairly broad applicability. 

The issue of whether the current collection of structures will provide significant lever- 
age in a wide variety of problems is empirical. So far, the only substantive claim I 
can make is that they do provide significant leverage in designing representations for 
twenty analytical reasoning problems. 

Assuming that the library structures provide significant leverage and are widely ap- 
plicable, classification can be viewed as the following useful knowledge acquisition 
heuristic: 

"Properties of library structure are useful enough that if they can not be 
identified in a problem, they are worth asking about." 

This is only a heuristic because it can cause the system to acquire information that 

is irrelevant to solving a problem and this, in turn, can cause it to over-design a 

representation. 
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4.3 Capture Verification 

The system designs representations for concepts in which constraints on those con- 
cepts are captured. An important consequence of a constraint being captured in a 
representation is that it is guaranteed to be true in any situation created in that 
representation. In fact, the following stronger statement is also true: a constraint is 
captured in a representation if and only if it is true in every situation that can be 
created with that representation. This section shows how this fact has been turned 
into a test, enabling the system to use a representation directly to determine whether 
a statement is captured in it. This test is called capture verification. The section also 
explains why such statements can be removed from the system's consideration. 

4.3.1 How Capture Verification Works 

The basic idea of capture verification is to prove that a statement is captured in a 
representation by demonstrating that it is true in every situation that can be created 
in that representation. This is a constructive proof technique: A representation is 
used to build situations, i.e., collections of data structures, and then those situations 
are inspected to determine if the consequences of the statement are true. The reason 
this test is useful is that the system does not really have to check every possible 
situation to verify that a statement is captured. In fact, it only has to check one 
"minimal" situation. This is guaranteed by requirements imposed on library types 
which are described in section 4.3.3. 

Consider an example of capture verification: Suppose the relation couple defines a 
sort whose members are married couples (i.e., couple(x) is true if £ is a set of size 
two whose elements are family members married to each other). Suppose further 
that this sort, call it married-couple, has been classified as having fixed sized disjoint 
members and, consequently, is represented as 

(deftype married-couple specializes 

fixed-size-dis joint-set (2, family-member)). 
Finally suppose a problem containing couple and married-couple also contains the 

statement 

: Vx\fy^z[couple({x,y}) Aii t/.A couple({x,z}) Ai^;^y = :j. 
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Capture verification proves that this statement is captured by constructing a situation 
representing the antecedent of the statement and then checking that situation to 
see if the consequent is true. First it creates two anonymous family members by 
instantiating the family-member ADT without giving names for the instances. In 
this exposition we will call these instances x and y. After creating these, the system 
assumes that they are disequal by adding the statement x ^ y to the situation. An 
operation corresponding to adding this disequality is supported by an equality system 
(discussed in Chapter 7) that is part of all specialized representations. 

Next the system creates an instance of married-couple whose elements are x and y. 
It then performs similar steps to create z, assume that x ^ z, and create a married 
couple containing x and z. When the second instance of married- couple is created, 
a procedure associated with married- couple enforces the disjointness property by 
combining the two separate instances (since they share x and, therefore, must be the 
same). Then another procedure enforces the fixed size of members of married-couple 
by combining the instances y and z (this is true because neither y nor z is equal to 
x and because the married- couple may contain only two individuals). 

When capture verification checks the situation created, it finds the consequent of the 
above statement to be true and concludes that the statement is captured. 

In general, capture verification can directly check statements in implication normal 
form by creating a situation representing the conjunction of antecedents and then 
checking that all the consequents are true in that situation. 

A conjunction of atomic formulas is considered to be in this form, i.e., n can be 
0. To test a general statement in this form, capture verification creates anonymous 
individuals for each universally quantified variable mentioned in the statement and 
then tests whether the statement is true in the situation containing those individuals. 
For example, suppose that child-set is a sort whose members are sets of individuals 
that are children of the same couple and that child-set-of is a function mapping an 
individual to the set of children he/she is a member of, i.e., 

V-cVt/jx 6 child-set-of(y) •£> 3z(child(z,x) A child(z,y))}. 

Further suppose that we have designed representations child-set-of and 
child-set. Then the statement, 
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\/x[x 6 child-set-of(x)}, 
is tested as follows. First we create an anonymous individual x and then we cre- 
ate an instance of child-set, call it y. Next, we make y be the image of x under 
child-set-of . Finally, we check to see if x is a member of y. 

4.3.2 How Capture Verification is Used 

As representations are designed, constraints expressed in problem statements get 
captured by those representations. Capture verification is used to identify statements 
in a problem that express captured constraints. It is run whenever classification of a 
concept terminates. The system removes statements identified by capture verification 
so that the problem statement is, at any point, an accurate record of which statements 
are not captured by a representation. 

The problem statement plays two important roles as a record of statements left to 
be captured. First, when a problem statement is empty, the system knows that all 
of the constraints of a problem are captured and, consequently, the design effort 
is complete. Second, concept introduction creates alternative problem formulations 
which are compared. The comparison process relies on the alternative formulations 
of the problem statement being accurate records of which statements are uncaptured 
in each. 

4.3.3 Why Capture Verification Works 

This section answers two questions: Why is is sufficient for capture verification to 
create and check only minimal situations? And why is it sufficient for it to check only 
one minimal situation? The reason that only minimal situations must be checked is 
that library structures are required to be monotonic. When a structure is monotonic, 
specific facts that are true in a situation created with that structure remain true no 
matter what information is subsequently added. 

As an example of a structure that violates monotonicity, suppose relation was 
implemented as one list of n-tuples with the absence of some n-tuple being inter- 
preted as negation, e.g., the absence of < A, B > from the married list meaning 
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-imarried(A, B). In this implementation relation is non-monotonic because adding 
the pair < A, B > changes the truth value of married(A, B). 

Capture verification does not have to check "larger" situations because monotonicity 
guarantees that such situations do not change the truth value of the facts in a minimal 
situation. 

One reason that capture verification need only check one minimal situation is that 
all library structures are required to be logically sound. When a structure is logically 
sound, every consequence of its invariant axioms is true in situations created with the 
structure. For example, sym-rel is sound because every logical consequence of the 
symmetry axiom is true in situations created with it. In particular, if R is represented 
as a symmetric relation, then not only is it the case that R(A,B) is true just in case 
R(B,A) is, but also it is the case that -^R(A,B) is true just in case -^R(B,A) is. 

Obviously, more than one minimal situation can be constructed for any conditional 
statement. For example, to check P =$■ Q, we can construct a situation containing 
P and then check for Q, or we can construct a situation containing -># and check 
for -if. However, the logical soundness of library structures guarantees that capture 
verification need only check one of these. 

The other reason that capture verification need only check one minimal situation 
has to do with checking general statements. If general statements were checked in 
a brute force way, capture verification would be impractical for them. There are a 
large number (often an infinite number) of minimal situations that can be created to 
check a general statement, each differing only in the specific individuals mentioned. 
The well known law of generalization of constants for first order predicate calculus 
allows capture verification to check one situation containing anonymous individuals 
and infer that any situation differing only in the specific individuals mentioned will 
be the same. 

The law of generalization [Bell & Machover 77, p.65] states that if it is possible to 
prove a theorem of the form a(x/cf from a set of formulas $ and the constant c 
does not occur in $ or a, then $ j= \/x[a\. Introducing new anonymous instances 
is equivalent to introducing previously unmentioned constants, thus any conclusion 

The notion afiHct rrreans c is substituted for every occurrence of x in a. 
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that capture verification draws from a situation created with anonymous individuals 
will be true for all situations differing only in the specific individuals mentioned. 

Monotonicity also has important consequences for the design process: 

1. If a representation captures a set of statements, then any specialization will 
capture those statements. 

2. If a representation captures a set of statements, then it captures every subset 
as well. 

The combination of these properties means that any subset of the statements in a 
problem can be tested by capture verification at any time and if they are captured 
in the representation, subsequent representation design activity will not change this. 
Therefore, once capture verification indicates that a statement is captured it can be 
removed from the consideration of subsequent representation design. 

If monotonicity is relaxed, this ceases to be the case and statements must be con- 
tinually rechecked as representation design proceeds. Furthermore, there would no 
longer be any guarantee that the representation design process moved monotonically 
towards more fully constrained representations since redefining the representation of 
a concept could cause captured constraints to become uncaptured. 

4.3.4 Interactions Between Knowledge Acquisition and 
Capture Verification 

In our example problem, classifying married results in statements of its properties 
being added to the problem. Then, when classification of married terminates, cap- 
ture verification removes these statements. One might think that these two processes 
are self defeating, i.e., one might wonder why these constraints are added simply 
to be immediately removed by capture verification. The reason is that concept in- 
troduction (the subject of Chapter 5) constructs equivalent alternative problem for- 
mulations. Statements captured in one formulation are not necessarily captured in 
another. Even though a statement acquired to specialize one representation will be 
captured by that representation; the-statement may not be captured in an alternative 
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representation. Such statements are included to ensure that alternative formulations 
will be equivalent. 

It is the function of knowledge acquisition to attempt to ensure that all relevant 
constraints appear in the problem, while the function of capture verification is to 
ensure that constraints are expressed in the problem statement or captured in the 
representation but not both. 

The combination of these processes designs representations correctly for each of the 
following: 

1. Cases where constraints that classification looks for are left out of a problem. 

2. Cases where classification finds the constraints it is looking for in the problem. 

3. Cases where constraints that classification looks for are in the problem but it 
does not recognize them. 

The first case is illustrated by our example problem. The system acquires missing 
constraints, designs a representation to capture them, and then verifies that those 
constraints are indeed captured. 

The second case would occur, for instance, if our example problem contained the 
statement 

yx\/y\married[x,y) =±> married(y.x)}. 
In this case, the system would not ask the user about the symmetry of married 
and this statement would be removed by capture verification after the classification 
of married. 

The third case occurs because the techniques that classification uses to find a state- 
ment expressing a constraint are necessarily incomplete. In this case, knowledge 
acquisition will add redundant information to the problem. Capture verification, 
while also incomplete, is more powerful than the technique that classification uses, 
often identifying captured constraints that classification misses. 
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4.4 Summary 

The representation design system captures constraints on a concept by representing 
it in terms of a library structure enforcing those constraints. The best structure (or 
set of structures) for representing a concept captures the most constraints on the 
concept without capturing constraints that are not true of the concept. Given a 
fixed vocabulary (set of concepts and sorts), classification identifies the best available 
library structures for representing each concept (and sort) in that vocabulary. 

For classification, the library structures are organized into two hierarchies: one con- 
taining structures for representing concepts (Figure 4.1) and the other containing 
structures for representing sorts (Figure 4.3). Structures in the concept hierarchy 
capture constraints on concepts, for example, function captures single valuedness 
(i.e., x = y =s- f(x) = f{y)). Structures in the sort hierarchy capture constraints 
between or common to the members of a sort, for example, fixed-size-set is used 
to represent a sort whose members are sets and captures the constraint that all the 
sets are the same size. 

Classification does knowledge acquisition, assuming that if the constraints it is looking 
for are not found in the problem statement, then they are worth asking the user about. 
When the user says that a concept has some property, the system adds a statement 
to the problem expressing that fact. Thus, classification often results in an expansion 
of the problem statement. The example given in this chapter is the classification of 
married in the small FAMILIES problem. The result is shown in Figure 4.7. The 
statements added during the classification of married are enclosed in the box. 

Classifying married also results in a the following specialized representation for it: 

married: antitrans-sym-irr-rel(f amily-member, family-member) . 

Capture verification is a constructive proof technique which uses a representation 
directly to determine if a constraint is captured in it. The technique relies on the 
following fact: A constraint is captured in a representation if and only if it is true 
in every situation that can be created with that representation. The technique is 
used to identify the statements that a classification effort has captured. The sys- 
tem removes captured statements from its problem description so that subsequent 
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Chapter 5 

Concept Introduction 



Concept introduction extends the classification process by adding new concepts to 
a problem. The system introduces new concepts to explore the classification of al- 
ternative problem formulations. A new concept is introduced by defining it in terms 
of existing concepts in such a way that the semantics of a problem are not changed. 
When a new concept is introduced, a new representation is also introduced, giving the 
system access to different parts of the structure library, i.e., the new representation 
captures different constraints and has different specializations. 

Gaining access to a different part of the structure library provides the opportunity 
to capture more problem constraints or to capture the same constraints more effi- 
ciently. Consider, for example, the introduction of parents during the design of the 
representation of the small FAMILIES problem: Parents is defined in terms of child 
as 

\/x\/y[x € parents(y) <=> child(x,y)}. 
This definition is equivalent to the more familiar 

parents — \y.{x | child(x,y)}. 
However, as will be explained, the system requires new concepts definitions to be 
directly usable in reformulation. The first formula above has this property, while the 
second does not. 

Introducing parents allows the system to view the concept "child" as a function into 
sets instead of as a relation, i.e., the new symbol parents is a function from family 
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members to sets of their parents. When parents is introduced, a representation is 
introduced for it and for its range: parents is defined in terms of function and 
parent-set is defined as a subtype of set (family-member). 

After introducing a new concept, the system creates a reformulation of the problem in 
terms of it. Alternative formulations are constructed by using the logical definition of 
a new concept to rewrite existing statements. For example, the definition of parents is 
treated as a rewrite rule to construct a formulation of the problem in terms of parents 
by rewriting every occurrence of child to parents. For instance, one statement that 
is rewritten in the small FAMILIES problem is 

Vx child(P.x) ■=■ x = R. 
The formulation in terms of parents contains the equivalent statement 

ix P G parents(x) v4> x = R. 

One fact in the formulation of small FAMILIES in terms of child is that every in- 
dividual is a child of exactly two other individuals. In the formulation in terms of 
parents this fact becomes, "all sets of parents have exactly two members which allows 
classification to specialize parent -set in terms of fixed-size-set. 

The parents representation, along with the specialized version of parent-set, cap- 
tures the size constraint, a constraint that the system is not able to capture by 
specializing child because no specialization of relation captures size constraints. 
However, parents alone leaves some of the constraints uncaptured that child does 
capture. 

Alternative formulations of a problem are compared based on how efficiently the 
problem constraints can be captured in them. The cost of a formulation is the cost 
of the concepts in it. The cost of a concept is the cost of the machinery capturing 
the constraints on that concept. 

Consider an example of comparing two formulations. After introducing parents, the 
system compares the formulation in terms of it to the formulation in terms of child. 
It turns out that the cost of the machinery capturing the constraints on parents is 
less than the cost of the machinery for child. Since the statements mentioning other 
concepts in these two formulations are the same, the cost of the other concepts is 
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the same. Therefore, the cost of the formulation in terms of parents is less than the 
formulation in terms of married and the former formulation is preferred. 

There are times when the system introduces alternative concepts and then decides 
that representations should be included for more than one alternative, i.e., the pre- 
ferred formulation contains more than one alternative concept. For example, during 
the design of a representation for FAMILIES, an alternative for parents is introduced 
which is a function mapping a family member to his/her set of children. Let us call 
this function children. After a comparison analysis, it is decided that the problem 
representation should contain parents and children (but not child). From a suf- 
ficiency point of view only one of these concepts is necessary, however, it turns out 
that the representation containing both captures the problem's constraints more effi- 
ciently than a representation of either one alone. Thus, the system has decided that 
for this problem it is best to use multiple representations of the "child" concept. 

There are also times when the system introduces alternative concepts and is forced 
to keep more than one equivalent concept. This happens when it is not able to 
reformulate a problem entirely in terms of an introduced concept. 

Concepts are also introduced to reformulate mixed statements that are restrictions 
to specific statements about sets. The motivation for this has to do with the desire 
to design representations for problem classes. If a problem contains a statement 
restricting the number of individuals that can stand in some relation to a specific 
individual, we would like the system to design a representation that is not dependent 
on the number of individuals in the given statement. This way the representation can 
be used for another problem that has a similar constraint, but differs in the number of 
individuals involved. For example, we want the representation designed for a problem 
containing the statement 

Vx\child(P.x) <^ x — R\ 
to also work for a similar problem containing 

Vx[child(P,x) *> x = R V x = 5], 
and so on. 

The system accomplishes this type of generalization by reformulating restrictions 
to create statements about sets and then generalizing those statements to involve 



105 



arbitrary sets. For instance, to accomplish this in the above example, the system 

introduces the children function and reformulates. The reformulated versions of the 

two statements above are 

3yi3zi[children(y l ) = {zi}] 
3y 2 3z 2 3w 2 {children(y 2 ) = {z 2 ,w 2 }] 

which the system generalizes to 

3y3z children(y) — z, 
where z is taken to be a set. Hence, a representation is designed to allow state- 
ments restricting the set of an individual's children to a constant set of any size. 

Also, to ensure that alternatives for formulations containing mixed constraints are 
always preferred, the system assigns an infinite cost to mixed statements. 

Classification extended by introduction is called extended classification. Extended 
classification is interesting because, while the two processes involved are fairly simple, 
the behavior of the combination can result in multiple reformulations of a problem. 
Several examples of this will be given later, including the sequence of introductions 
that results from the classification of married: 



1. The concept spouses is introduced. This is a function from individuals to the 
sets of individuals to whom they are married. 

2. The concept non-empty-spouses is introduced. This is a partial function from 
individuals to the non-empty sets of individuals to whom they are married. 

3. The concept spouse is introduced. This is a partial function that captures the 
fact that individuals have at most one spouse. 

4. The concept couple is introduced. This is a partial function from individuals 
to the married couple that they are members of. This function captures the 
following facts: not all individuals are married, each married couple is disjoint 
from all other married couples, married- couples contain exactly two members. 
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5.1 Introduction Rules 

Introduction is implemented as a collection of condition-action rules that are associ- 
ated with nodes in the structure library hierarchies. A rule's condition part checks 
properties of the concept being classified. When the conditions are met, the action 
part introduces a new concept and its representation. 

Figure 5.1 gives an example of an introduction rule which is associated with the 
node in Figure 5.2 for irreflexive, symmetric, antitransitive relations. When this 
node is reached while classifying a relation, the system reformulates that relation as 
a function into sets. Given a relation R, the rule introduces a function Fr into sets 
of the form {x \ R{x,y)}. This is done to explore the use of the function and set 
representations (and their specializations) to design a more specialized representation 
than was possible for R. 

"For the relation R : Si x s 2 introduce the new concept Fr defined as 

y X Vy{y E Fr(x) ^ R(x,y)}. 
Also introduce a representation for the range of Fr as 
(deftype FR-ran specializes set(s2)) 
and introduce the representation Fr as 
Fr: f unct ion (s, FR-ran)." 

Figure 5.1: An example Introduction rule 

In general, introduction rules specify new biconditionals (usually one, occasionally 
two) that define a new concept. By convention, we will always write such a definition 
with the new concept on the left. Introduction rules define representations for the 
concepts they introduce and often also introduce sorts along with their representa- 
tions. 

The system uses the logical definition of the new concept to create a new rewrite rule 
each time the introduction rule is applied. The example rule rewrites occurrences 
of the term R(x,y) as y 6 Fr(x), where R is whatever relation is being classified 
when the introduction rule is applied. These rewrite rules create alternative problem 
formulations. 

The form of logical definitions of new concepts is restricted to allow them to be used 
directly in reformulation. Such a definition must be expressed as a biconditional 
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Figure 5.2: Node for irreflexive, symmetric, antitransitive relations. 



between conjunctions of one or more atomic formulas. When both sides of a definition 
contain only one atomic formula, it is used as a rewrite rule as already described. 
When the left hand side is a conjunction, it is treated as a rewrite rule that replaces 
the atomic formula on the right by the conjunction of atomic formulas. For example, 
the definition 

\/xiy[couple-of{x) - couple-of(y) Ai^y» married(x,y)} 
is treated as a rewrite rule that replaces atomic formulas of the form married(x,y) by 
the conjunction on the left hand side. When this rule is used to rewrite the statement 
married(N,P), the result is 

couple-of(N) = couple-of(P) A N ^ P. 

When the right hand side of a definition is a conjunction, it is treated as a con- 
ditional rewrite. For- example, consider the following definition of the function 
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non-empty-brothers: 

VxVj/[x = non-empty-br other s(y) <=> x = brothers(y) A brothers(y) ^ 0]. 

This is interpreted as follows: 

"If a statement S contains a term of the form brothers(y) and it can be shown 
from the structure of S that the term denotes a non-empty brother set, then 
replace the term by non-empty-brothers(y)." 

The intention behind the phrase, "can be shown from the structure of S" is that 

the system will only attempt to show a condition by a syntactic analysis of S. For 

example, the above rewrite rule would be applied to the statement, A £ brothers{B), 

(where A and B are constants) because the statement unconditionally asserts that 

Z?'s brother set is non-empty. 

5.1.1 Capturing Constraints Between Sorts 

Once the system introduces a sort, it attempts to capture taxonomic constraints 
between that sort and other sorts. This allows the system to capture some constraints 
between concepts, a type of constraint that classification can not capture. 

As a simple example, suppose the system has introduced the sort of the elements for 
which a relation P is true and for the sort of the elements for which a relation Q 
is true. Let us call the representations of these sorts P-dom and Q-dom respectively. 
Then it will try to capture problem statements expressing a relationship between P 
and Q as a subsumption constraint between the types P-dom and Q-dom. For example, 
it captures the statement 

Vx[P(x) => Q(x)} 

by making P-dom a subtype of Q-dom: 

(deftype P-dom specializes Q-dom). 

Through the process of introduction, the system can transform statements into a 
form recognized as a subsumption constraint and capture it in the type hierarchy 
of representations. For example, the constraint in the FAMILIES problem that the 
parents of an individual are married is initially expressed as 

VxVy\/z[child(x,z) A child(y,z) A x ^ y => rnarried(x,y)]. 
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During design the system introduces two concepts: parents and couple. The latter is 
a mapping from an individual to the couple he/she is a member of. It also introduces 
representations for the ranges of these functions: parent-set and married-couple re- 
spectively. As will be described later in this chapter, because of properties that the 
system discovers of parent-set, it introduces parent-set-of, a function mapping an 
individual to the parent-set he/she is a member of. As a result, the above statement 
is rewritten as 

VzVy\parent-set-of(x) — parent-set-of (y) => couple(x) = couple(y)}, 
which is recognized as expressing a subsumption relationship between parent-set 
and couple. 1 Therefore, it removes this statement from the problem and adds the 
following to the representation: 

(deftype parent-set specializes couple). 

Thus, through the process of introduction, the system has turned this statement, 
originally a constraint between child and married, into a subsumption constraint 
between parent-set and married-couple and captured the constraint efficiently in 
the type taxonomy of representations. 



5.2 Soundness of Introduction 

As we will see in the remainder of this chapter, many new concepts can be introduced 
during representation design. We would like to be sure that when a solution is found 
in the new formulation of a problem, it is a solution to the original problem. In other 
words, we would like to show that the introduction process is sound. 

From a model theoretic point of view, a new concept is introduced by two actions: 
adding a new constant to the language of the problem and adding a new statement 
to the problem. It is well known that there can be no more models of a set of 
statements than of any of its subsets. Therefore, since concept introduction adds a 
new statement, the new problem can not have more models than the original. To 



This statement actually contains more information than the subtype constraint. In general, 
such statements can not be removed. However, in this case, the additional information turns out to 
be captured elsewhere in the representation, allowing the system to remove the statement. 
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ensure soundness, we show that no models of the original problem are eliminated by 
concept introduction. 

This is done by ensuring that every introduction has the property that every model 
of the original problem can be extended to a model of the new problem which satisfies 
the new statement. Since the new models are extensions of models of the original 
problem, they are themselves models of the original problem. Since every model 
can be shown to have such an extension, no model of the original problem has been 
lost. When every introduction meets this restriction, the process as a whole is sound 
because every introduction step is sound. 

This restriction is checked (by hand) for each introduction rule by interpreting the 
statement that the rule introduces as an abstract procedure to perform on models of 
the original problem to get the extended models. For example, the rule in Figure 5.1 
can be shown to meet the restriction by showing how to treat the statement 

VxVy[t/€ F R (x) <& R{z,y)] 
as a procedure that extends any model of the original problem. 

Take any model of the original problem and add Fr to the set of function symbols 
of the model. Next, for each element, x in the domain of Fr, create a pair of the 
form < x,{y | R(x,y)} >. Add the union of these pairs to the domain of the model 
and designate this new set by the symbol Fr. The new model is an extension of the 
original model. 

Note that when an introduction rule is conditional, we can assume its conditions while 
checking the restriction. Then we show that any model of the original problem that 
satisfies the conditions of the rule can be extended to a model of the new statement. 

Each introduction rule in the system has been checked and shown sound in this 
fashion; it follows that the introduction process, as a whole, is sound. 



5.3 Extended Classification 

In order to present the intuition behind extended classification, we first explain the 
process under the assumption that no- knowledge acquisition is required to design 



5.3. EXTENDED CLASSIFICATION 111 

representations. How extended classification is actually done for analytical reasoning 
problems is more complicated because they are usually missing information. This 
more complicated case is discussed below. 

Concept introduction rules are attached to nodes in the library hierarchies. Extended 
classification collects the introduction rules it encounters while classifying a concept. 
When classification of the concept terminates, the system runs capture verification. 
Statements that can be captured by classification are general statements that mention 
only the concept being classified. If any statements of this form remain uncaptured 
after classification of the concept, all the introduction rules found while classifying it 
are checked. 

All the rules whose conditions are satisfied are applied and each applied rule intro- 
duces a new concept. Note that the representation of a new concept may not capture 
statements that the original concept captures. Therefore, the formulations associated 
with new concepts contain reformulated versions of all the statements mentioning the 
original concept whether or not they are captured by the representation of the origi- 
nal concept. Extended classification proceeds by classifying each of the new concepts. 
When all the classification efforts terminate and capture verification is run in each 
alternative formulation, the alternatives are compared. Note that classification of an 
alternative can cause additional concepts to be introduced. 

Consider as an example part of the extended classification of married. The intro- 
duction rule in Figure 5.1 is associated with the node for irreflexive, symmetric, 
antitransitive relations (shown shaded in Figure 5.3). 

When applied, the rule introduces the concept spouses 2 , a function from family 
members to their sets of spouses. Spouses is defined with the following statement: 

\/x\/y[y € spouses(x) <& married(x,y)]. 

A formulation of the problem in terms of spouses is generated when the concept 
is introduced. This formulation contains a new statement for every statement in 
the original formulation that mentions married, including those statements already 
captured by married (e.g., symmetry). Each new statement is logically equivalent to 



Recall that the system gives this concept a gensymed name. We have changed it to spouses for 
clarity of presentation. This is the case for the examples given throughout this chapter. 
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Figure 5.3: The node reached in classifying married. 



its counterpart in the original formulation but is expressed in terms of spouses. For 
example, the statement, 

Vx\/y[married(x, y) <=> married(y,x)\, 

has an equivalent counterpart in the new formulation: 

\fxVy{y £ spouses(x) «if spouses(y)}. 

After spouses is introduced, extended classification proceeds by classifying it (it is 
the only alternative in this example). It turns out that as spouses is being classi- 
fied, an alternative is introduced for it: spouse, a function from family members to 
their spouses. The system determines that spouse is preferable to both spouses and 
married (or their combination). Therefore, the formulation in terms of spouse alone 
is preferred. 
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5.3.1 Extended Classification in Incomplete Problems 

Analytical reasoning problems often omit information necessary to solve them. There- 
fore, when classification of a concept terminates, the system can not assume that the 
introduction rules collected need only be tried when statements are left uncaptured. 
Instead, the system uses concept introduction not only as explained so far but also to 
extend the knowledge acquisition done by classification. For example, in a problem 
containing married which does not state that it is symmetric, classifying the concept 
acquires this property because there is a specialization of relation for symmetry. 
However, classifying married will not acquire the constraint that at most two people 
are married to each other because no specialization of relation captures a constraint 
like that. 

The system always assumes that a problem is incomplete and applies introduction 
rules whether or not a specialized representation captures all the stated constraints 
on a concept because classifying a concept so introduced may uncover additional 
missing information. Therefore, whenever a classification effort terminates, if any 
introduction rules have been collected, they are tried without regard to the problem 
statement. 

Classification of a concept introduced in this manner often yields constraints on the 
new concept. These are also constraints on the original concept. The statements 
of these new constraints are reformulated in terms of the original concepts because 
the system compares alternative concepts based on the cost of the constraints on 
them. For example, the FAMILIES problem is stated in terms of child and is missing 
the constraint on the number of parents of a family member. Classifying child does 
not uncover this missing constraint. However, a rule found while classifying child 
introduces the concept parents and classifying it does uncover this constraint because 
the range elements of parents are sets and set has a specialization that captures size 
constraints. When parents is classified, the following statement is added to the 
formulation of the problem in terms of parents: 

yxVy\/z\y £ parents(x) Act parents(x) A y ^ z =• parents(x) = {y,z}]. 

Versions of a statement added in this way must be translated into the alternate 
formulations being investigated. This is done by returning to the logical definition of 
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the concept that is being classified and using this definition to derive a rewrite rule 
"that goes in the other direction." For example, when the statement above is added 
to the problem, the logical definition of parents in terms of child, 

VxVy[x £ parents{y) <=> child(x,y)], 
is used to derive a rule that rewrites the newly acquired statement to an equiva- 
lent statement in terms of child. The result is 



\/x s iy'iz[child{y,x) A child(z,x) A y ^ z A child(w,x) =• w = y V 



/ w 



5.3.2 Comparing Alternative Formulations 

Decisions to choose from amongst alternative formulations require a comparison of 
their relative costs. The cost of a formulation is the sum of the costs of the concepts 
in it. The cost of a concept is the cost of the machinery capturing the constraints on 
it. 

In general, some of a concept's constraints get captured by its representation designed 
by classification and the rest get captured by procedures operationalization generates 
for them. Thus, the cost of a concept is the cost of its representation designed by 
classification plus the cost of the machinery generated by operationalization. 

To facilitate estimating the cost of a concept's representation, each library structure 
has a cost estimate associated with it. This makes it straightforward to estimate 
the cost of a concept's representation: the cost estimates of its component structures 
are simply added together. The cost estimate for all library types except trans-rel 
and antitrans-rel is 1; the estimate for both of these is n. These estimates are 
explained below. 

The system has an evaluation function that, given a statement in implication normal 
form, estimates the cost of the procedures that operationalization will generate to 
capture that statement. A concept's estimate is related to the complexity of the 
procedures enforcing the constraints on that concept. The estimator takes advantage 
of the fact that its results will be used solely for the comparison of alternative formu- 
lations to simplify the estimation task. The estimates it produces need not be and 
are usually not accurate complexity measures. 
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The estimator also takes advantage of the restricted structure of the procedures gen- 
erated by operationalization. These are always in the form of nested loops. For 
example, consider the following statement of transitivity: 

y X VyVz[R(x,y) A R(y,z) =?■ R{x,z)}. 
The procedure that operationalization generates for this statement runs every time a 
new < x,y > pair is added to a problem situation. When it runs, x and y are bound 
to constants, but z is unbound. Therefore, the procedure must search through the 
list of existing pairs related by R for any of the form < y,z > and, for each such pair, 
adds a pair of the form < x,z >. This procedure performs one loop over the pairs 
related by R to enforce the transitivity constraint each time it is called. 

Complexity estimates are computed for statements in as just illustrated: The vari- 
ables in the first atomic formula are assumed to be bound. Then the procedure steps 
through the atomic formulas in the statement. For each, it checks to see if it contains 
unbound variables; then it assumes they are bound and continues. The number of 
loops is the number of atomic formulas with unbound variables in them. We will 
estimate the complexity of a procedure with no loops in it as 1, a procedure with one 
loop as ra, a procedure with a loop nested inside another as n 2 , and so on. Since, the 
procedure that operationalization will generate to capture transitivity has one loop 
in it, its complexity estimate is n. 

The estimates for library structures are computed (by hand) in the same way: For 
each procedure, we count the nesting of loops to arrive at an estimate for that pro- 
cedure. Then these estimates are added to obtain an estimate for the structure. 

The estimation procedure is more complicated than simply counting the atomic for- 
mulas that have unbound variables in them for two reasons. First, an atomic for- 
mula with unbound variables does not always require a loop. For example, sup- 
pose the range of the function spouse is family members. Then the atomic formula 
y = spouse{z) does not require a loop when y is unbound as long as z is bound. 
Therefore, the procedure is refined to check both the unbound variables and the 
representation of the concept mentioned in an atomic formula. Checking an atomic 
formula can yield one of three results: the formula can require no loops, one loop, 
or there can be no way of determining the possible bindings for its variables. When 
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the third case occurs in a statement, its cost is estimated as infinite. We have seen 
examples of the first two cases. As an example of the third case, consider estimating 
the cost of the following statement 

\/x\/y\/z{x = spouse(y) A y — spouse(z) => x ^ spouse(z)}. 
Assuming that x and y are bound, note that z is unbound in the second atomic 
formula. In general, there is no way to determine all the individuals whose image 
under a function is equal to a particular individual. Therefore, there is no way to 
determine possible bindings for z in this statement and its cost is estimated to be 
infinite. 

The second complication in the estimation procedure is that sometimes reordering 
the conjuncts in the antecedent makes the estimate lower. For example, the cost 
estimate of the statement above becomes 1 when its conjuncts are reordered because 
y is bound when we reach the second conjunct in 

VxVyVz[y = spouse(z) A x = spause(y) => x ^ spouse(z)}. 
In light of this complication, the cost estimator considers all permutations of the 
antecedent atomic formulas and returns the lowest cost estimate. 

The estimate for the complexity of a collection of statements is a polynomial. The 
worst this polynomial can be is the sum of the estimates for the individual statements. 
It can be less than the sum of the individual statements because sometimes constraints 
get captured as a side effect of capturing others. In light of this, an estimate is 
calculated for the uncaptured statements mentioning a concept by systematically 
operationalizing the statements starting with those having the lowest estimate and 
working up until all the statements are captured. The estimate for the collection 
is then the sum of the estimates for the statements actually operationalized. This 
estimate is added to the estimate for the concept's representation to obtain the total 
estimate for a concept. 

Consider an example of comparing the cost of alternative formulations: A problem 
formulation in terms of married is shown in Figure 5.4 and an equivalent formulation 
in terms of spouse is shown in Figure 5.5. 

Married is classified as an irreflexive, symmetric, antitransitive relation. The resul- 
tant representation, married, captures all but the size constraint. The cost estimate 
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Irreflexive: \/x^married(x,x) 

Symmetric: \/x\iy[married(x,y) <^> married(y,x)] 

Antitransitive: VxVy\/z\married(x,y) A married(y,z) => -<married(x,z)] 

Size constraint: V 'x\/y\? 'z[married(x ,y) A married(x,z) => y — z] 

Figure 5.4: A set of statements expressing constraints on married. 

Irreflexive: V;c x ^ spouse(x) 

Symmetric: \/xVy\x = spouse(y) <^> y = spouse(x)} 

Antitransitive: \/xiy\/z{x = spouse(y) A z — spouse(y) => x ^ spouse(z)} 

Size constraint: \/x\/y\/z{x = spouse(y) Ax — spouse(z) => y — z] 

Figure 5.5: Formulation of the statements in terms of spouse 

for married is computed from the library as n + 2, i.e., the sum of the estimates 
for irr-rel, sym-rel, and antitrans-rel, which are 1, 1 and n respectively. The 
evaluator estimates the cost of the size constraint in Figure 5.4 as n, making the total 
cost estimate for married 2n + 2. 

Spouse is classified as a partial one-to-one function. Spouse captures the size con- 
straint at a cost of l. 3 An estimate of 1 is computed for antitransitivity in Figure 5.5 
using the fact that spouse is represented in terms of function and reordering the 
conjuncts in the antecedent. The cost estimates calculated for other two uncaptured 
statements is also 1, making the total estimate for spouse 4. Therefore, spouse is 
preferred over married. 

Decisions about which alternative formulations to keep are complicated by the fact 
that better problem formulations can often be found by including more than one 
alternative concept in a formulation. To allow the representation design system to 
find these, the comparison procedure is generalized to consider all subsets of a set 
of alternatives; it then chooses the subset of alternatives that has the lowest cost 
estimate. 

When two subsets have the same cost estimate, the set whose concepts are closer 
to the initial concepts is preferred. When this rule does not establish a preference 



3 Recall that the representation design system determines which statements are captured by cap- 
ture verification. For example, to determine that the size constraint is captured, capture verification 
creates a problem situation in which an anonymous individual x is the spous* of another individual 
y and x is the spouse of z. A procedure associated with 1-1 function forces y to be equal to z. 
Thus, the consequent is true and the statement is captured. 
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between two subsets with the same cost estimate, one is chosen arbitrarily. 

Normally the logical relationship between two alternative concepts is kept implicitly 
in the rewrite rules that generate one formulation from another. For example, the 
equivalence between spouse and married is kept in the rewrite rule that generates the 
formulation in terms of spouse from the formulation in terms of married. However, 
when a formulation contains alternative concepts, statements expressing the relation- 
ship between each alternative must be added to the formulation. For example, the 
system must add the statement, 

VxVy[married(x,y) ■*=> y = spouse(x)}, 
to a formulation that includes both married and spouse. 

The general procedure for computing the cost of a set of more than one alternative 
concept first computes the cost of each concept in isolation. To compute the cost of a 
set of concepts, it identifies those statements left uncaptured by all the concepts in the 
set. Next it determines the minimum cost of each of those statements, which requires 
comparing the costs of the alternative formulations of each statement. Finally it 
sums the costs of the representations of each concept, sums the minimum costs of 
the uncaptured statements, and adds to this the cost estimates for the statements 
relating the alternative concepts. 

To illustrate this general comparison procedure, let us return to the example above 
and compare the cost of {married, spouse} to {married} and {spouse}. The cost 
of {married, spouse} is computed as follows. All the statements in the example are 
captured by the combination of married and spouse. Thus, the cost so far is simply 
the sum of the costs of married and spouse, which is n -f 3. Including the constraint 
between the two concepts increases the estimate to n + 4. Since the formulation in 
terms of spouse alone has cost 4, it is preferred. 



5.3.3 An Example Choosing Multiple Alternatives 

A case where it is more cost effective to keep multiple alternative concepts is illus- 
trated by the extended classification of the child relation in a problem whose relevant 
statements are shown in Figure 5.6. 
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Irreflexivity: Vx^child(x,x) 

Antisymmetry: Vx\/y[child(x,y) => ->child(y,x) 

Antitransitive: VxiyVz[child(x,y) A child(y,z) => ^child(x,z) 

Size constraint: Vxiy\/z"iw[child{y,x) A child(z,x) At/^A child(w,x) => 

w — y V w — z] 
A mixed constraint: \lx-child{A,x) 

Figure 5.6: Statements relevant to child. 

Child is first classified; when this completes two introductions occur. One introduces 
parents; the other introduces children, a function from individuals to their set of 
children. After these are classified, it turns out that the size constraint in Figure 5.6 is 
captured by parents and the mixed constraint is captured by children. It also turns 
out that even though keeping both of these in the representation requires maintaining 
the relationship between them, this representation is still more cost effective than 
keeping any single concept, as shown below. 

The classification effort begins with the child relation, which is classified as irreflexive, 
antisymmetric and antitransitive. Two introduction rules are found: one introduces 
children, the other introduces parents. Children is a function from family members 
to sets of the form {y | child(x,y)}. The range of children is a sort called child-set. 
Note that the rule that introduces this is the same rule that introduced spouses for 
married. The reason that two introductions were not tried for married is that the 
node for irreflexive, symmetric, antitransitive relations has only one introduction rule 
associated with it. This reflects the fact that there is no point in trying both rules 
when a relation is symmetric because the functions they create are equivalent. 

The two introductions cause two new formulations of the problem to be generated. 
The formulation in terms of children is shown in Figure 5.7 and the formulation in 
terms of parents is shown in Figure 5.8. 

Recall that the system contains a simplifier that partially canonicalizes statements. 
The new form of the mixed constraint in Figure 5.7 is the result of reformulation 
followed by simplification of the statement \/x~>child(A,x). First, the reformulation 
rule introduced with children rewrites the statement as 

Va; x $ children(A). 
This is simplified to 
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children(A) = 0. 4 

Irreflexivity: \/x x $ children(x) 

Antisymmetry: \/xiy[y G children(x) =^> x g children(y)] 

Antitransitive: \/x\/y\/z[y G children(x) A:G children(y) =$> z (jL children(x)] 

Size constraint: VxVy\/z\/w[(x G children(y) Ait children(z) A y ^ z 

Ax G children(w)) 
=> w = y V w = z] 
A mixed constraint: children(A) = 

Figure 5.7: Example problem rewritten in terms of children 



Irreflexivity: "ix x g" parents(x) 

Antisymmetry: VarVy[x G parent s(y) => y $ parents(x)} 

Antitransitive: yx'iy\/z[x G parents(y) A y G parents(z) => x g parents(z)] 

Size constraint: \f xVy\/ zVw[(y G parents(x) A 2 G parents(x) Ay ^ z 

Aw G paren^(x)) 

A mixed constraint: Vx .4 G" parents(x) 

Figure 5.8: Example problem rewritten in terms of parents 



In comparing the concepts child, children, and parents, the system cre- 
ates formulations for {child, parents}, {child, children}, {parents, children}, and 
{child, parents, children}. Initially, the formulations for the subsets with more than 
one concept contain only the constraints between the concepts in the subset. The 
formulation for {child, children} contains the single statement: 

'ix\/y[y G children(x) «■ child(x,y)\, 
which is just the logical definition of children. The formulation for {child, parents} 
contains the single statement: 

\/x\/y[x G parents(y) v±- child(x,y)]. 
The initial formulation for {children, parents} contains the single statement: 

VxVy[y G children(x) <^> x G parents(y)}. 
Initially, the formulation for {children, child, parents} contains the three constraints, 



4 The rewriting system that implements this simplifier is discussed in more detail in appendix A. 



5.3. EXTENDED CLASSIFICATION 121 

\/x\/y[x £ parents(y) <=> married(x,y)\ 
\/x\/y[y 6 children(x) <^> married(x,y)} 
\Jxiy\y € children(x) *$■ x (E paren^((/)j 

The extended classification effort proceeds by classifying parents and children. The 
details of this are given later. Of interest for the current example are the four repre- 
sentations that result from the classification: 

(deftype child-set specializes dis joint-set (family-member) ) 

children: partial-function(f amily-member , child-set) 

(deftype parent-set specializes 

fixed-size-dis joint-set (2, family-member) ) 

parents: function (family-member, parent- set) 

To decide which formulation to keep, the system first estimates the cost of each 
concept. The child relation captures all but the size constraint and the mixed 
constraint at a cost of n + 2. The cost of the size constraint is n 2 . Recall that we 
want the system to reformulate mixed statements as specific statements. To integrate 
this preference with the existing mechanism for exploring alternative formulations, 
the system estimates the cost of mixed statements as infinite. Therefore, the cost of 
{child} is infinite. 

The parents function (along with parent -set) captures the size constraint at a 
cost of 1 but does not capture any other constraints. Both the irreflexivity and 
antitransitivity constraints have estimates of 1. However, as in the formulation in 
terms of child, the cost of the mixed constraint in this formulation is estimated as 
infinite, making the total cost of parents infinite. 

Initially, the children function (along with child-set) captures only what was the 
mixed constraint in the other formulations. The cost of these representations is 1. 
Irreflexivity and antisymmetry are estimated to have cost 1, the cost of antitransitiv- 
ity is estimated as n, and the size constraint is estimated to have cost n 2 . Therefore, 
the total cost estimate for {children} is n 2 + n + 3. 

The next step is for the system to compare the cost of all subsets of the three concepts 
that have finite estimates. There are three such subsets in this example: children, 
{children. parents}, and {child, children}. 

The total cost of {children, parents} is determined to be 6 as follows. The total 
cost of the representations is 2. The statements left uncaptured by both concepts are 
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irreflexivity, antisymmetry, and antitransitivity. Irreflexivity and antisymmetry each 
cost 1 in both formulations. Antitransitivity has a cost of 1 in the formulation in 
terms of parents and n in the formulation in terms of children. Therefore, the system 
includes the version expressed in terms of parents in the formulation containing both 
concepts. The cost of the constraint between parents and children, i.e., 

VxVy[x G parents(y) ^j/£ children(x)}, 
is estimated to be 1. 

Finally the costs of the three alternatives are compared: {children} has cost n 2 + n+Z, 
{child, children) has cost n 2 + n + 4 and {children, parents} has cost 6. Thus, the 
{children, parents} alternative is chosen. 

Figure 5.9 shows the collection of statements associated with this alternative. The 
statements in this figure are those whose cost estimates were used to calculate the 
total cost. Note that since irreflexivity and symmetry have the same cost estimate in 
both formulations, the statements expressing those constraints are chosen arbitrarily 
from the parents formulations. 

Irreflexivity: Vz x g" parent s(x) 

Antisymmetry: VxWy[x £ parents(y) => y g - parents(x)} 

Antitransitive: \/xiy\/z[x £ parents(y) A y G parents(z) => x $■ parents(z) 

Size constraint: \/x\/y\/z\/w{(y G parents(x) A z G parents(x) A y ^ z 

Aw G parents(x)) 
=>w = y\/w = z] 
A mixed constraint: children(A) = 

Interconcept constraint: VxVyfy G children(x) <^> x G parents(y)} 

Figure 5.9: Formulation in terms of both parents and children. 



5.4 Special Types of Introduction Rules 

In general, extended classification applies introduction rules as has been explained: 
All the rules associated with nodes reached in classification of a concept are collected. 
When classification terminates all rules are tried and the resulting alternatives are 
compared. This procedure is required because, in general, alternative concepts cap- 
ture different subsets of the constraints on a concept. 



5.4. SPECIAL TYPES OF INTRODUCTION RULES 123 

Currently there are two types of special introduction rules for which the general 
comparison procedure can be circumvented. One is called a replacement rule. An 
introduction rule is a replacement rule in either of the following two cases: 

1. The rule introduces an alternative concept whose representation captures the 
same constraints as the representation of the existing concept but is more 
efficient. 5 

2. The rule introduces an alternative concept whose representation is just as effi- 
cient and captures a superset of the constraints of the existing concept. 

Because these rules replace one concept with another, when classification of a con- 
cept reaches a node with such a rule, classification of the concept need not continue. 
Therefore, replacement rules are applied immediately, in contrast to the normal pro- 
cess of waiting until classification of the existing concept terminates. 

One example of a replacement rule is shown in Figure 5.10. It introduces the new 
function /' which is defined so that where the value of / is the singleton set {A}, the 
value of /' is A. 

"When a sort S is represented as 

(deftype S specializes fixed-size-set (1, si)), 
then introduce a function that "flattens" S as follows. Let / be the defining function 
for S and suppose / is represented as 

f: function(s2,S). 
Then introduce /' with the definition 

VxVy[x = f'(y) & x € f(y)], 
represented as 

f: function(s2,sl)." 

Figure 5.10: Rule that introduces a function into individuals when an existing func- 
tion is into sets of size 1. 

This rule is attached to the fixed-size-set node in the sort hierarchy. An example 
of this rule's use is given in the next section during the classification of spouse-set. 
Since non-empty spouse sets all have size one, the rule introduces the function spouse, 
from an individual to that individual's spouse. 

Note that the efficiency difference may be more fine grained that the cost estimation procedure 
can measure. 
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The rule is a replacement rule because case one above applies: f ' captures the same 
constraints as the combination of f and S and f ' is more efficient, removing a "level 
of indirection" through the singleton sets. 

The second special type of introduction rule is called an extension rule. An extension 
rule introduces a concept that is not an alternative to any existing concept but which 
results in a more efficient representation. An example of this type of rule is attached 
to the dis joint-set node in the sort hierarchy. When the members of a sort are 
found to be disjoint sets, this rule introduces a function from the members of those 
sets into those sets. For example, members of parent-set are found to be disjoint 
and so this rule introduces a function parent-set-of, mapping an individual into the 
parent set that he/she is a member of. For instance, if A and B are parents of C, 
then parent-set-of [A] = parent-set-of {B) = {A,B}. 

The function introduced by this rule extends a problem representation so that it 
captures disjointness efficiently. A straightforward way to enforce disjointness is every 
time a new set of this type is created (or an existing set is modified), search a list of 
all other sets (of this type) to check for an intersection. If an intersection is found, 
combine the two sets involved. Introducing the new function takes advantage of the 
fact that every individual is in at most one of the sets of this type to avoid ever 
having to perform the above check to enforce disjointness. 

Having direct access from an individual to the set that it is a member of increases 
the efficiency of the representation in other ways as well. For example, introducing 
parent-set-of in the FAMILIES problem reformulates the statement, 

VarVyVcf x £ parents(c) Ay E parents(c) A x ^ y => 
couple(x) = couple(y) A x ^ y] 

as 

VjjVj/VcI parent- set-of(x) — parents(c)/\ 

parent-set-of(y) = parents(c) Ai^j/^> (1) 

couple(x) — couple(y) Ax ^ y\. 

From this statement, the simplifier first produces 

VzVy[ {parent-set-of {x) = parent-set-of {y) A x ^ y) => 
{couple{x) — couple(y) A x ^ y)}. 

The above statement is then further simplified to 
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\/xVy[parent-set-of(x) = parent-set-of(y) =>■ couple(x) — couple(y)}. 

As noted, this statement is recognized as a subsumption constraint and is captured 
in the type hierarchy. 



5.5 Extended Classification of Married 

This section illustrates the extended classification process with the classification of 
married. We begin with a summary of the main steps. After married is classified 
as an irreflexive, symmetric, antitransitive relation, the function spouses is intro- 
duced along with spouse-set, a sort for the range of spouses. During the classifica- 
tion of spouses, a partial function, called non-empty-spouses, is introduced. This 
maps from family members to non-empty spouse sets. Next, since range elements of 
non-empty- spouses all have size one, non-empty-spouses is replaced by the concept 
spouse, mapping a family member to his/her spouse. Spouse is classified as a partial 
one-to-one function and is determined to be its own inverse. This causes the partial 
function couple to be introduced, which maps an individual to the couple he/she is 
a member of. 



5.5.1 Introduction of Spouses 

As discussed in the last chapter, married is classified as a irreflexive, symmetric, 
antitransitive relation, placing us at the shaded node in Figure 5.3. As noted earlier, 
the system finds and applies the rule associated with this node which was shown in 
Figure 5.1. The result of applying this rule is to introduce spouses as 

VxVy[i/ € spouses(x) <^ married{x,y)\. 

Representations are defined for spouses and its range (spouse-set) as 

(deftype spouse-set specializes set (family-member)) 
spouses: function (family-member, spouse-set). 

The logical definition above is used to generate a formulation of the problem in terms 

of spouses. The original formulation of the problem is shown in Figure 5.11 and the 

formulation in terms of spouses is shown in Figure 5.12. 
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married(Q, P) 

\/x\/yyc[child(x,c) A child(y,c) A x ^ y => married{x 1 y)] 
\/x~married(x, x) 

\/x\/y[married(x, y) <=> married(y, x)] 
\/xiyiz[married(x, y) A married(y, z) => married(x, z)} 

Figure 5.11: Statements from small FAMILIES problem mentioning married 
P 6 spouses{Q) 

\/x^yMc[child{x,c) A child(y,c) A x ^ y =■ y t spouses(x)} 
Vx x rf: spouses(x) 

VxVyfy £ spouses(x) <=>■ x £ spouses(y)} 
VxVyVzIz 6 spouses{y) Aj/G 5pou5e5(z) => x $ spouses(z)} 

Figure 5.12: Statements from small FAMILIES problem rewritten in terms of spouses 



5.5.2 Classification of Spouses 



Next, classification of spouses is initiated. This requires prior classification of its 
range spouse-set because the sorts a concept is defined over are always classified 
before the concept. Since spouse-sets are sets, classification begins at the set node 
in the sort hierarchy. The first question is whether spouse-sets all have a fixed size. 
The representation design system can not find any information about this in the 
problem statement, so it asks the following question: 

Do sets of the form {y I married(x.y)} all have the same size? 

No. 
The next question is whether all members have the same maximum size. Again the 
system asks. 

Do sets of the form {y |married(x,y)} all have the same maximum 
size? Yes, 1. 

Next is the question of whether their are empty spouse sets: 

Can sets of the form {yl married (x,y) } be empty? Yes. 

At this point classification has reached the leaf node for empty members (Figure 5.13) 
which is checked for introduction rules. Thcfollowing rule is found: 
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"When a sort 5, whose members are sets, contains empty members, introduce 

a partial function into the non-empty members of S as follows. Let / be the 

defining function for S and suppose the representation for / is 

f : function(sl.S). 

Then define the function non-empty- f as 

VxVy[y = non-empty- f(x) & y = f(x) A f(x) ^ 0] 

Vx\/y[non-empty-f(x) = ± ^ y = f( x ) A f(x) = 0]. 

Also introduce representations for non-empty- f and its range: 

(deftype non-empty-S specializes S) 

non-empty-f: partial-function(sl, non-empty-S)." 

In this case, the rule introduces a sort whose members are non-empty spouse sets 
with the definition, 



VxVy[x = non- empty- spouses(y) & x = spouses(y) A spouses(y) ^ 0]. 



UNIQUE 
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FIXEtK^ 
SIZE v-^ 
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Figure 5.13: Node reached while classifying spouse-set 

This causes the formulation of the problem shown in Figure 5.14 to be generated and 
classification of non-empty-spouse-set begins. The first question that classification 
raises is whether individuals of this sort all have the same size. The representation 
design system deduces that they all have size one from the following three facts: 

1. non-empty-spouse-set is a subsort of spouse-set 

2. members of the collection spouse-set have maximum size one 

3. members of the collection non-empty-spouse-set are non-empty 
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P £ non- empty- spouses(N) 

yx\/y\/c[child(x, c) A child(y, c)Ax^y=>ye non-empty- spouses(x)} 
\/x x $ non-empty- spouses(x) 

VxVy[y G non-empty- spouses(x) «x£ non-empty- spouses(y)} 
\/x\/yVz[x G non- empty- spouses(y) Aye non-empty-spouses(z) => 

x £ non-empty-spouses(z)} 

Figure 5.14: The problem rewritten in terms of non-empty-spouses 

5.5.3 Introduction of Spouse 

This classification effort has now reached the node for fixed size sets (shown shaded in 
Figure 5.15). The replacement rule of Figure 5.10 is associated with this node. The 
rule applied in this case replaces the function non-empty- spouses with the function 
spouse. The rewritten version of the problem is shown in Figure 5.16. 
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Figure 5.15: Sort hierarchy with node for fixed size set shaded. 

Vx x ^ spouse(x) 

VxVy[x = spouse(y) <£> y = spouse(x)} 

VxJy\/z[x = spouse(y) Az = spouse{y) => x ^ spouse(z)} 

Vxiy\/z[x = spouse(y) A x = spouse(z) => y = z\ 

\/xVyVc{child{x, c) A child(y, c)Ax^y=>x = spouse(y)} 

Figure 5.16: Statements expressing constraints on married formulated in terms of 
spouse 



Notice that this rule would not have been applkd if members of the col- 
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lection non-empty-spouse-set had any other size but one. In that case, 
non-empty-spouse- set would be redefined in terms of fixed-size-set. 

5.5.4 Classification of Spouse 

The function spouse is classified as partial and one-to-one. The system determines 
that spouse is partial from the fact that non-empty- spouses is. It determines that 
spouse is one-to-one from the following problem statement, derived by the introduc- 
tion process: 

"ixMy[x = spouse(y) ■$$ y = spouse(x)}. 
It also determines from this statement that spouse is its own inverse. 

This classification effort terminates at the nodes for partial function and one-to-one 
function (shaded in Figure 5.17). 

5.5.5 Introduction of Couple 

The node for one-to-one functions has the following introduction rule associated with 
it: 

'"If a one-to-one function / is its own inverse, introduce a sort for sets of the 

form {x, f(x)}, and a function /' mapping an individual into its pair as follows. 

Let the representation of the function be 

f: l-l-function(s,s). 

Introduce the function /' as 

V*Vy[/'(y) = f'(x) A x ^ y ** x = f(y)}. 

Then introduce a representation f ' -pair for the range of /' as 

(deftype f -pair specializes f ixed-size-disjoint-set(2,s)). 

Introduce the representation f ' as 

f : function(s,f ' -pair)." 

This rule exploits the fact that a function being one-to-one and it being its own 

inverse means that the sets so introduced are disjoint from each other. In the case 

of our example, the rule introduces the concept of married couple and produces 

the formulation of the problem shown in Figure 5.18. Note that the statement of 

irreflexivity in this figure is equivalent to true. Therefore, it is removed. 

The next step is the classification of couple and its range, call it married-couple. 
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Figure 5.17: Nodes reached in classifying spouse. 
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couple(P) = couple(Q) A P ^ Q 

\/x\/yic\child{x, c) A child(y,c) A x ^ y => couple(x) = couple(y) Ay ^ x\ 
\fx[couple{x) ^ couple(x) V x = x] 

yxVy[couple(x) = couple(y) A y ^ x ^ couple(y) = couple(x) A x ^ y] 
\/x\/yVz[(couple(y) = couple(x) Ax ^ y A couple(z) = couple(y) A y -£ z) => 

couple{z) 7^ couple(x) V x = z] 

Figure 5.18: The problem rewritten in terms of couple 



The classification of married-couple is trivial since the associated representation 
is already defined to have fixed sized disjoint members. The classification of couple 
yields the following definition 

couple: partial-function(f amily-member, married-couple) . 

Couple is determined to be partial from the fact that spouse is. 

The final step in this example is to compare the alternative formulations: when couple 
is compared with spouse, the system determines that couple is cheaper; couple is also 
found to be cheaper than married. 

All of the statements in this formulation, except the one mentioning child, are cap- 
tured by the combination of couple and married-couple. 

One final note. The fact that couple is partial is not used in this problem. However, 
suppose the problem also contained the statement, "A is not married," i.e., 

s ix^married{ A, x). 
When spouses is introduced, the following form of this statement is added to the 
problem, 

spouses(A) = 0. 
Then when non-empty- spouses is introduced, the following statement is produced, 

non-empty-spouses(A) =J_, 
which is eventually rewritten as 

couple(A) =_L. 

When a problem situation is created from a statement like this one, the library type 
part ial^f unction records that the domain element involved has no image under 
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this function. If a subsequent attempt is made to assign non-empty-spouses(A) a 
value, a contradiction is signalled. 6 

5.5.6 Summary of Extended Classification of Married 

Recall that before the extended classification of married, the small FAMILIES prob- 
lem was in the state shown in Figure 5.19. Also recall the following statements 
are added by knowledge acquisition during the classification of married (and before 
introducing spouses): 

P :, family-member, Q : family-member, 

R : family-member, S : family-member 

grandchild{Q, S) 

Vx child(P, x) <£> x = R 

married(Q, P) 

VxVy[grandchild(x,y) <^> 3z(child(x,z) A child(z,y))} 

\/x*iy[child(x,y) <& parent{y,x)\ 

yxiy"ic[child{x,c) A child(y,c) Ai^t/4- married(x,y)} 

\/x\/y\fc[married(x,y) A child(x,c) => child(y,c)} 

Query: find-all x: parent(S,x) 

Figure 5.19: The small FAMILIES problem before classification of married. 

\/x^married(x, x) 

yxiy{married(x,y) <& married(y,x)\ 
Vx\/y\/z\married(x,y) f\married(y, z) =p- -<married(x,z)]. 

For the sake of clarity in the example of the extended classification of married, we 

assumed that the these statements were already in the problem. Note that the size 

constraint is not present in Figure 5.19 and is not acquired during classification of 

married. After spouses is introduced, the classification of spouse-set uncovers the 

size constraint. 

Figure 5.20 shows the state of the problem after extended classification of married 
and Figure 5.21 shows the representation. The overall effect of this has been to acquire 
missing constraints and to capture several statements with the specialized represen- 
tation of couple. The captured statements are enclosed in a box in Figure 5.20. 



6 0r if non-empty-spouses(A) already has a value when this statement is added to a problem 
situation, a contradiction is signalled. 
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P : family-member, Q , / amily-member) , 
R : f amily-member , 5 : f amily-member 
grandchild(Q, S) 

\Jx child(P,x) <=> x = R 

couple(P) = couple(Q) A P ^ Q 

yx\/y[grandchild{x,y) <£> 3z(child(x,z) A c/«7i(z,t/))] 

Vx\/y[child(x,y) ■& parent(y, x)\ 

\/xVy3c[child(x, c) A child(y,c) A a; ^ j/ =£> couple(x) = couple(y) Ay ^ x\ 

\/xiy\fc[couple{y) = couple(x) Ay ^ x A child{x,c) => child(y,c)} 



Va;V7/[co«p/e(x) — couple(y) A y ^ x 4=> couple(y) = couple(x) A x ^ y] 
VxVy\/z[(couple(y) = couple(x) Ax ^ y A couple(z) — couple(y) A y ^ z) 

couple(z) 7^ co«p/e(x) V x = z] 
\/xMyiz[couple(x) — couple(y) A couple(x) = couple(z) => y — z] 



Query: find-all x \ parent(S,x) 

Figure 5.20: The state of the small FAMILIES problem after extended classification 
of married. 



(deftype family-member specializes unique-individual disjoint) 

child: relation (family-member, family-member) 

parent : relation(f amily-member, family-member) 

(deftype married-couple specializes 

fixed-size-dis joint-set (2, family-member)) 

couple: partial-function(f amily-member, married-couple) 

Figure 5.21: Representation of the small FAMILIES problem after extended classifi- 
cation of married. 
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5.6 Extended Classification of Child 



This section takes up the extended classification of parents and children, concepts 
introduced during the classification of child. In addition to presenting another ex- 
ample of extended classification, this section illustrates that there can be interaction 
between the classification of closely related concepts. This can affect classification 
and knowledge acquisition. In this example, we will see that the classification of 
members of the sort parent-set as disjoint affects the classification of and knowledge 
acquisition for child-set, the sort of the range of children. 

We begin with a summary: The child relation is classified as irreflexive, antisym- 
metric, and antitransitive. The node reached as a result of this has two introduction 
rules associated with it which introduce parents and children. In this example, we 
arbitrarily begin with the classification of parents. Parent-set (the range of parents) 
is classified as a sort whose elements are fixed size disjoint sets. 

When the two new concepts are introduced, alternative formulations of the problem 
in terms of both of them are introduced. Also a formulation containing both concepts 
is introduced. This formulation contains a statement expressing the relationship be- 
tween parents and children. The representation design system uses this relationship 
and the fact that parent-sets are disjoint to deduce that child-sets are also disjoint. 
Eventually the functions parents' and children' are introduced. Parents' is a one- 
to-one function from child-set to parent-set, while children' is a one-to-one function 
from parent-set to child-set. These two are found to be inverses of each other. These 
discoveries contribute to the preference for a formulation of the problem in terms of 
both parents' and children'. 

The details of this example follow. This time we will suppress the knowledge acquisi- 
tion done by the system and simply begin with the statements shown in Figure 5.22. 
These are the statements from the small FAMILIES and acquired by knowledge ac- 
quisition that mention child. 
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Irreflexivity: Vx^child(x,x) 

Antisymmetry: \/xVy[child(x,y) =>• ->child{y,x)} 

Antitransitivity: Vx\/yVz[child(x,y) A child(y,z) =» -ic7u7d(a;,z)] 

Size constraint: \/x\/y\/z\/w[child(y, x) A child(z, x) Ay ^ z A child{w,x) 

=>w = y\Jw = z\ 
The mixed constraint: Vx child(P,x) $$ x = R 
child/couple: V W?/V 'z[child(y ; x) A child(z, x) A y ^ z => 

couple(y) — couple(z) Ay ^ z] 

Figure 5.22: Statements from small FAMILIES mentioning child, including those 
acquired during classification. 

5.6.1 Classification of Parents 

Parents is a function mapping family members to their set of parents. Figure 5.23 
gives the problem statement rewritten in terms of parents. Before parents can be 
classified, parent-set must be, because the sorts that a concept is defined over must 
be classified before the concept. This begins at the top of the sort hierarchy. Members 
of the sort parent-set are sets, so the system asks whether parent sets have a fixed 
size (yes, everyone has two parents). 

Irreflexivity: \/x x $ parents(x) 

Antisymmetry: VxVy[a; € parents(y) => y $. parents(x)} 

Antitransitive: VxiyVz[x € parents(y) Aye parents(z) => -*x <£ parents(z) 

Size constraint: VxVt/VzVujfj/ 6 parents(x) AzG parents(x) Ay ^ z 

Aw € parents(x) 

=>w = y\/w = z] 
The mixed constraint: Vx P € parents(x) •&■ x — R 
parents/couple: V;cVt/Vz[y € parents(x) A z G parents(x) A y ^ z =$> 

couple(y) = couple(z) A y ^ z] 

Figure 5.23: Example problem rewritten in terms of parents 

Continuing, we come to the question of whether parent sets are disjoint (yes). This 
places classification at the nodes for fixed size sets and disjoint sets (shown shaded 
in Figure 5.25). Recall that there is an extension rule associated with the node for 
disjoint sets. The rule is shown in Figure 5.24. 

In the case of our example, the rule introduces the function parent-set-of a mapping 
from a family member to the parent set he/she is a member of. Along with this 
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"When the range S of a function / has elements that are disjoint sets, introduce a 
function mapping an individual of the element sets in to the set it is a member of as 
follows. Let S be represented as 

(deftype S specializes dis joint-set (si)). 
Introduce the function map-S as 

VxVy[map-S(x) = f(y) & x G /(</)], 
with representation 

map-S: functional, S)." 

Figure 5.24: Rule that, given a sort of disjoint sets, introduces a function mapping 
an individual to the unique disjoint set it is a member of. 

introduction is a rule that rewrites terms of the form x G parents(y) to the form 
parent-set-of(x) — parents(y). The result of rewriting the problem is shown in 
Figure 5.26. 
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Figure 5.25: Node reached in classifying parent-set. 



This rule also rewrites the constraint between parents and children in the formulation 
of the problem involving both: 

\/xiy[x € children(y) <S> y € parents(x)}; 
it is rewritten as 

*ixiy[x E children(y) <$ parent- set- of (y) = parents(x)]. 
This completes the classification of parent-set, so the representation design system 
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Irreflexivity: Vx parent-set-of(x) ^ parents(x) 

Antisymmetry: \fx\/y[parent-set-of(x) = parents(y) =>• parent-set-of(y) ^ parents(x)] 

Antitransitive: \/xiy\/z[parent-set-of{x) = parents(y) A parent- set-of(y) = parents(z) 

=> parent-set-of(x) ^ parents(z)] 
Size constraint: \/x\/y\/z\/w[parent-set-of(y) = parents(x) A parent- set- of(z) = parents(x) 

Ay ?i;A parent- set-of(w) = parents(x) 
=> w = y V w = 2] 
A mixed constr: Vz parent-set-of(A) ^ parents(x) 
parents/couple: Myi z\parent-set-o f {y) = parent-set-of(z) A y ^ z ^~ 

couple(y) = couple(z)} 

Figure 5.26: Example problem rewritten in terms of parent-set 

returns to the classification of parents which is classified as a function that is not 
one-to-one. 



5.6.2 Introduction of Children' 

Classification of parents ends at the nodes for functions that are not one-to-one 
(shown shaded in Figure 5.27). This node has the following introduction rule associ- 
ated with it: 

"If a function that is not 1-1 has range elements that are disjoint sets, then 
introduce a representation for sets of domain elements that map to the same 
range element. By definition, the sets in this new set are also disjoint, so 
introduce a 1-1 function from the range of the original function to the new 
set. More precisely, let / be a function that is not 1-1 and let it be represented 
as f : function(sl ,S) 
and let 5 be represented as 

(deftype S specializes disjoint-set(s2)). 
Then introduce the function /' as 

VxVy[y € /'(*) <* * = /Ml> 

introduce a representation for the range of /' as 

(deftype f'-ran specializes disjoint-set(sl)), 

and introduce a representation for /': 

f: f unct ion (s 2, f'-ran)." 

To see that the sets introduced by this rule are disjoint, consider the current case. The 

range elements of parents are disjoint sets. If we collect together sets of individuals 

that have the same parents, i.e., sets whose elements have the same image under 

parents, then we. have created a set of. disjoint : sets. So the rule introduces the sort 
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children' -ran whose members are sets of family members with the same parents. 

Ii then introduces a 1-1 function that we call children which maps parent sets to 
children sets such that if an individual x is a member of the parents of y, then the 
children'-ran of y equals the children' of the parent-set-of(x), i.e., 

VxVy[y G children'(x) ■& x = parents(y)}. 

As usual, this definition is used as a rewrite rule that generates a new formulation of 
the problem. It also generates a new version of the statement, 



\/x\/y[x € children(y) v4> parent- set- of(y) = parents(x)}. 

(Recall that this is the result of rewriting the constraint between parents and 

children, done when the function parent-set was introduced.) The new version of 
this statement is 



VxVy[x G children(y) <=> x t childr en' '(parent- set-of(y))}, 



i.e., 

Vy[children(y) = children' '(par ent-set-of(y))}. 

An important inference is made from this statement: the range of children' is deter- 
mined to be the same as the range of children (i.e., the child-set = children'-ran). 
From this fact, the representation design system chooses one of the sorts to be the 
representative for both of them. Note that, because of this, the representation design 
system now knows that members of the sort child-set are disjoint. 



5.6.3 Classification of Children' 



The system now classifies children' which, as usual, requires prior classification of its 
range. This has just been determined to be child-set. Thus, the system begins with 
the classification of child- set. Members of this sort do not have a fixed size nor do 
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yxiy[child-set-of{x) = children(y) <£> x G children(y)]. 

Because the range of children' is child-set, the following additional rewrite rule is 
introduced 

Vx\/y[child-set-of(x) = children'(y) ^ x £ children'(y)]. 

These two rules rewrite the statement 

Mx\ly[x 6 children(y) »xG children 1 (par ent-set-of(y))\ 



as 



Va;Vyi child- set- of(x) — children(y) <=> 

child- set- of(x) = children'(parent-set-of(y))}. 

This completes the classification of child-set; the classification of children' can now 
be completed. It is a partial one-to-one function. 

Since children' was an alternative introduced for parents, a cost analysis of both 
concepts is performed. The estimates come out the same, so for the moment the 
system records a preference for parents (because parents is closer to an initial concept 
than children'). 

5.6.4 Classification of Children 

Now classification of children is initiated (child-set has already been classified). The 
concept children is a partial function (because child sets can be empty) that is not 
one-to-one. This places classification at the nodes for partial functions and functions 
that are not 1-1 (shown shaded in Figure 5.27) where we have been before. Recall 
that there is an introduction associated with this node. When the range elements of 
the function being classified are disjoint sets, this rule reflects that range through the 
function to its domain, creating sets of domain elements that have the same range. 
The sets of domain elements created in this way are disjoint. The rule also introduces 
a one-to-one function from the new sets to the original range elements. In this case, 
the rule can be applied because the range of children are disjoint sets. It introduces 
a sort whose members are sets of parents with the same children. Also it introduces 
the one-to-one function parents' with the following definition, 

tfx\/y[y €. parents' (x) <£» x = children(y)}. 
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This is used to rewrite the problem statement 

yx\/y[ child- set- of(x) = children(y) ■& 

child- set- of(x) — children'(parent-set-of(y))} 



as 



Va;Vy[ y G par ents '(child- set-o f (x)) <=> 

child-set-of(x) — children'(parent-set-of(y))] 

Recall that earlier the sort parent-set was introduced. Members of this sort are 

sets of parents. When parent sets where found to be disjoint, the partial function 

parent- set-o f was introduced, mapping a family member into the parent-set he/she 

is a member of. The rewrite rule derived from the definition of parent-set-of now 

rewrites the statement above as 

VxVyl parent-set-of(y) = parents'(child-set-of(x))Leftrightarrow 
child- set-of(x) — children'(par ent- set-o f (y))} 

This statement is recognized by classification as defining parents' and children' as in- 
verses. Recall that earlier the system recorded a preference for parents over children'. 
Now the system has determined that children' is the inverse of the concept parents'. 
A formulation in terms of both of these is preferable to parents. This is the final 
result of the extended classification of child. It was first reformulated in terms of 
parents and children and then these two concepts were reformulated as parents' and 
children'. 

5.6.5 Summary of Extended Classification of Child 



This section summarizes the effect of extended classification of child on the small 
FAMILIES problem. The state of the problem after extended classification of married 
is shown in Figure 5.28. 

Knowledge acquisition during classification of child adds the following statements: 
\/x^child(x,x) 



vxvy 



chil 



l*vy; 



4- -c 



\y,x) 



\/x\/y\lz[child{x,y) A child(y, z) => ->child(x,z). 

Parents and children are introduced; knowledge acquisition during classification of 

parents adds the size constraint: 

"ixiyiziw[(y G parents(x) Az € parents(x) Ay ^ z Aw G parents(x)) 

-•>■■=> w — yVw = z] 
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P : / amily -member , Q : family-member, 

R : family-member, S : family-member 

grandchild(Q, S) 

Vx child(P,x) O x = R 

couple(P) = couple(Q) A P ^ Q 

\/x\/y[grandchild{x, y) O 3z(child(x,z) A child(z,y))] 

\txiy\child(x,y) & parent(y,x)} 

\/x\/y\/c[child(x, c) A child(y, c) Ax ^ y =>■ couple(x) = couple(y) A x ^ y] 

\/xiy\/c[couple{y) — couple(x) Ay ^ x A child{x,c) => cfez7c?(t/,c)] 



\/x\/y[couple(x) = couple(y) A y ^ x <$■ couple(y) — couple(x) A x ^ y] 
\/x x iyiz[{couple(x) — couple(y) A x ^ y A couple(x) = couple(z) A x ^ z) 



Query: find-all x | parent(S,x) 

Figure 5.28: State of small FAMILIES after extended classification of married 

The extended classification continues yielding the problem statement shown in Fig- 
ure 5.29 and the specialized representation shown in Figure 5.30. The first box in 
Figure 5.29 encloses the statements captured by couple, while the second box en- 
closes those captured by parent' and children'. The statement preceded by an 
asterisk is captured in the type hierarchy. 

5.7 Deriving New Mixed Constraints 

This section discusses how the system finds implicit restrictions. Chapter 3 explained 
that when a concept has an explicit restriction on it, a representation is included for 
that concept even if it is not primitive. 7 For example, the presence of the statement 
Vx^brother( M, x) in the big FAMILIES problem causes brother to be included in the 
representation even though it is defined in terms of sibling and male. Representations 
are included for these concepts because we want restrictions to be reformulated as 
specific statements. The only concepts that are reformulated are those that get 
classified and only concepts with representations get classified. 

The system identifies explicit restrictions by looking for mixed statements of certain 
forms in a problem. Clearly this approach will not identify implicit restrictions. For 
example, the statements 



7 Recall that a restriction is a mixed statement that restricts the number of individuals that can 
stand in-some relation to a specific individual. 
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children'(parent-set-of(P)) = {R} 
couple(Q) = couple(P) A Q + P 

3z[ children' '(parent- set- of(Q)) = child-set-of(z)A 
children '(par ent-set-of(z)) = child- set-of(S)] 
\/x\fy[child(x,y) 4=> parent(y,x)\ 
(*) ^x\/y[parent-set-of(x) = parent- set-of(y) => couple(x) = couple(y)\ 

\/x\/y[parent-set-of(x) ^L Acouple(x) = couple(y) =>• 

parent-set-of(x) = parent-set-of(y)] 
Vx x € parent-set-of(x) 
Vx x € couple(x) 



Vx\/y[couple(x) = couple(y) An/yo couple(y) - couple(x) Ax^y) 
VxVyVz[couple(x) = couple(y) Ax ^ y A couple(y) = couple(z) A y ^ z => 

couple(x) ^ couple(z) V x = z] 
\/xVy\/z[couple(x) = couple(y) Ai/j/A couple(x) = couple(z) A x ^ z => y 



Vx child- set-of(x) ^ children' (parent- set- of(x)) 
VxVy[ child- set- of(y) = children' (parent- set- of(x)) =>• 
child- set-of(x) ^ children'(parent-set-of(y))} 

VxVj/V ' z[(child-set-of{y) = children'(parent-set-of(x))A 
child- set- of(z) — children' (par ent-set-of(y))) => 

child- set-of(z) ^ children'(parent-set-of(y))} 

VxVt/VzVzo[ child- set- of(x) — children' (par ent-set-of(y)) A 

child- set-of(x) — children '(parent- set- of(z)) Ay ^ zA 
child- set- of(x)= children '(par ent-set-of(w)) => 

to = y V «) = z] 



Query: find-the parents' (child- set- of(S)) 
Figure 5.29: Formulation of small FAMILIES after extended classification of cfoW. 
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(deftype family -member specializes unique- individual disjoint) 

(deftype parent-set specializes 

fixed-size-disjoint-set (2, family-member)) 

(deftype child-set specializes disjoint-set (family-member) ) 

parent -set -of : partial-function (family-member, parent-set) 

child-set-of : function(family-member, child-set) 

parents' : l-l-function(child-set , parent-set) 

children' : 1-1-function (parent-set , child-set) 

parent : relation(f amily-member, family-member) 

(deftype married- couple specializes 

fixed-size-disjoint-set (2, family-member) ) 

couple: part ial-funct ion (f amily-member, married-couple) 

Figure 5.30: Representation of small FAMILIES after extended classification of child. 



\/x^sibling(M, x) 

\i xiy\brother(x ,y) <^> sibling(x,y) A male(y)} 
imply a restriction on brother even though neither is a mixed statement involving 

brother. 

The system detects implicit restrictions as problem statements get reformulated to 
remove explicit restrictions. For example, the presence of the explicit restriction on 
sibling above will cause a reformulation in terms of the function siblings, a mapping 
from an individual to his/her set of siblings. This will cause the second statement 
above to be rewritten as 

VxVy[brother(x,y) <^yG siblings(x) A male(y)]. 
When the concept siblings is determined to be preferable to sibling, the system 
reformulates the problem in terms of brothers. This rewrites the above statement as 

\/x\/y[y € brothers(x) O- y £ siblings(x) A male(y)}. 

The idea behind identifying implicit restrictions is that any time a concept is refor- 
mulated to remove a restriction (i.e., turned into a specific statement), the system 
adds restrictions for any concepts that define subsets of the original concept because 
we know that all subsets of a restricted set will also be restricted. For example, 
brothers(x) is a subset of the siblings(x). 

The system uses one other rule to identify implicit restrictions. When a set of sets 
{si, . . .,s„} forms a covering of another set 5 and each a restriction has been identified 
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on each of the Sj, the system restricts S. For example, In a problem where there are 
restrictions on brother sets and sister sets, the presence of the statement 

\/x\/y{sibling(x,y) <=> brother(x,y) V sister(x,y)} 
causes the system to place a restriction on sibling. In this case, reformulation and 
simplification transform this statement into 

\/x[siblings(x) = brothers(x) U sisters(x)}. 



5.8 Detecting Redundant Introductions 

It is common for more than one introduction rule to introduce the same concept 
during design. For instance, recall the example in section 5.3.2 where the concepts 
children and parents were introduced for child. It turns out that the following 
introduction rule is associated with the node for functions that are not one-to-one: 

"When the range elements of a function / are sets, i.e., when / is represented 

as 

f: function(sl ,s2) 

and s2 is denned as 

(deftype s2 specializes set(s3)), 

introduce the function /' as 

Vx\/y[x e f(y) »ye f'(x)\, 

introduce a representation for its range as 

(deftype F'-ran specializes set (si)), 

and represent /' as 

f ': f unct ion (s3, f'-ran)." 

When classifying children, this rule introduces a concept that is equivalent parents; 
when classifying parents, it introduces a concept that is equivalent to children. The 
system will name these equivalent concepts differently and will not know initially 
that they are equivalent. 

In general, detecting that concepts are equivalent is difficult. However, there is an 
efficient way to determine when the system has introduced a concept that is equivalent 
to a concept already introduced. A derivation is kept for each introduced concept 
giving the sequence of introduction rules used to derive it. The system detects the 
equivalence of two introduced concepts by comparing their sequences. 



146 CHAPTERS. CONCEPT INTRODUCTION 

A set of equivalence reductions is denned over these sequences so that all sequences 
will be transformed into a unique shortest sequence. This shortest sequence is the 
canonical member of a class of equivalent sequences starting from the original concept. 
If two concepts have equal canonical sequences they are equivalent. For example, the 
derivation given for the version of children introduced directly from child is, 

(left-pro j child), 
where "left-proj" is the name of the rule that introduces a concept children with 
the definition 

VxVy[x 6 children(y) <=> child(y,x)\. 
The derivation of an equivalent concept introduced from parents is, 

(swap right-proj child), 
where swap is the name of the rule given earlier in this section and "right-proj" 
is the rule that introduced parents from child. 

One reduction states that (swap right-proj) should be reduced to (left-proj). This 
reduces the second derivation of children so that it is equal to the first. 

Of course, this technique does not help with the more difficult issue of detecting 
equivalence between two concepts appearing in the initial problem. 



5.9 Chapter Summary 

This chapter has explained how concept introduction is used to extend classification 
by introducing new concepts, giving classification new ways to view existing con- 
cepts. Introduction rules are attached to nodes in the structure library hierarchies. 
Classification of a concept collects the rules found at nodes that it reaches. When 
classification of the concept terminates, the collected rules are applied. They intro- 
duce new concepts, defining them in terms of the concept that was being classified. As 
new concepts are introduced, the system explores alternative problem formulations. 

If analytical reasoning problems were complete (i.e., contained all the information 
necessary to solve them), the system would only have to apply introduction rules after 
classification of a concept when statements mentioning only that concept remained 
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uncaptured. However, since these problems are incomplete, introductions are always 
tried because classification of the concepts they introduce can uncover additional 
missing constraints. 

Alternative problem formulations are compared by estimating the cost of capturing 
the problem constraints in that formulation. Formulations with lower cost estimates 
are preferred over equivalent formulations with higher cost estimates. The cost of a 
formulation is the cost of the concepts in it. The cost of a concept is the cost of the 
machinery capturing the constraints on it. In general, some constraints on a concept 
are captured by its representation designed by classification and the rest are captured 
by procedures written by operationalization. 

The state of the small FAMILIES problem at the beginning of extended classification 
is shown in Figure 5.19. The set of represented concepts is {married, child, parent}. 
The system performs extended classification on each of these. Extended classification 
of married yields the problem shown in Figure 5.20; the representation produced is 
shown in Figure 5.21. Extended classification of child results in the problem shown 
in Figure 5.29 and the representation is shown in Figure 5.30. Finally the system 
considers classification of parent but finds that it has already been reformulated as 
parents' and the find-all query transformed into find-the par ents' '(child- set-of(S)). 

Extended classification of the small FAMILIES problem is now complete. The prob- 
lem statement and representation passed on to operationalization are those shown in 
Figure 5.29 and Figure 5.30. 



Chapter 6 
Operationalization 



Operationalization is the representation design system's way to capture the con- 
straints that extended classification fails to capture. 

In describing how operationalization captures constraints, it is useful to view a spe- 
cialized representation as a black box. Inside the box there is a collection of data 
structures that represent problem situations. The user of the box creates problem sit- 
uations by "telling" the representation about the specifics of a problem. For example, 
in creating a situation for the FAMILIES problem, the user "tells" the representation 
that N is married to P, Q is the grandfather of S, etc. 

Operationalization writes procedures that enforce constraints by watching for the 
execution of tell operations (or combinations of tell operations) that violate a con- 
straint. They respond to the execution of such operations by executing additional 
tell operations to reestablish the constraint. For example, consider the following 
constraint: 

^xiy[x S siblings(y) »i/G siblings(x)]. (1) 

One procedure that operationalization writes to capture this watches for the exe- 
cution of an operation that adds an individual x to the siblings of another individual 
y. This operation might violate the constraint because it may not be the case that 
y is in the siblings(x). The procedure reestablishes the constraint by executing an 
operation to add y to the siblings(x). 

As in this example, each constraint left uncaptured after extended classification is 
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"turned into" one or more procedures that reestablish the constraint. The number of 
procedures created per constraint is a function of the number of tell operations that 
can affect the constraint. This is determined from the operations of the data types 
used to define the representations in the constraint. For example, sibling sets are 
represented as sets. In addition to an operation that adds elements to sets, there 
is another operation that allows all the members of a set to be specified by equating 
the set to a constant, e.g., siblings(A) = {B, C}. To fully capture the constraint of 
(1), operationalization must also write a procedure that responds appropriately when 
an equate operation is executed. 

When procedures have been generated that respond to all operations that can affect 
a constraint, there is no way to create a situation that violates the constraint, i.e., 
the constraint is captured. 

In general, operationalization captures a constraint in three steps: (i) it determines 
which representation operations can affect the constraint, (ii) it determines what ad- 
ditional operations should be executed to reestablish the affected constraint, and (iii) 
it extends the representation with procedures that perform those additional opera- 
tions whenever one of the affecting operations is executed. 

For example, the constraint of (1) is captured as follows. Operationalization first 
identifies the operations that can affect sibling sets. The representation sibling-set 
is defined in terms of set. Suppose that set has only two tell operations that get 
used in building FAMILIES situations: ADD-ELEMENT(y,x) (add x to the set y) 
and EQUATE-TO-CONSTANT(x,y) (equate the set x to the constant set y). 

Operationalization determines the additional operations that must be executed to 
maintain the constraint of (1) whenever the above operations are executed and 
then writes two procedures. One procedure responds to the execution of an ADD- 
ELEMENT(siblings(x),y) by performing an ADD-ELEMENT(siblings(y),x). The 
other procedure responds to an ASSIGN-TO-CONSTANT(siblings(y),x) by perform- 
ing an ADD-ELEMENT(siblings(z),y) for each z in the constant set x. 

Thus, operationalization converts a general statement into a collection of procedures 
that respond to the execution of tell operations by executing other tell operations. 
Since executing a tell operation corresponds to adding a specific fact to a problem 
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situation, these procedures can be thought of as compiled lemmas that draw conclu- 
sions when new specific facts are added to problem situations. These conclusions are 
additional specific facts that must be added to the problem situation to maintain a 
constraint. 

6.1 Preprocessing for Operationalization 

In the previous chapters, figures illustrating the state of the small FAMILIES problem 
contained statements with named individuals in them. This was done to clearly 
illustrate the effects of representation design on the problem. As noted, the system, 
in fact, works with a description of a problem class. The figures in this chapter 
will use the class description directly. Figure 6.1 shows the class description for 
the small FAMILIES problem after extended classification. Many, if not all, of the 
problem statements have been reformulated in terms of representations introduced 
during extended classification. At this point, the description may contain general 
statements with existential quantifiers in them such as 

yx\/y[grandchild(x,y) <=? 3z(child{x, z) A child(z.y)}. 

3x3y[cauple(x) = couple(y)} 
3x3y[children'(couple(x)) = y) 
3x3y3z[ parent-set-of(x) = parents(child-set-of(z))A 
parent-set-of(z) = parents(child-set-of(y))} 
yx\/y-j>arent-set-of(x) ^_i_ Acouple(x) — couple(y) 

=? parent- set- of{x) — parent-set-of(y)} 
\/x[parent-set-of(x) ^-L=> x (E parent-set-of(x)} 
\/x\couple(x) ^J-=> x € couple(x)} 
Vx x £ child- set-of(x) 

Figure 6.1: The state of the small FAMILIES class description after extended classi- 
fication. 

It may also contain statements defining concepts that do not have representations. 
The statement above is also an example of this second type: The previous steps of 
representation design did not design grandchild because grandchild is not primitive 
and does not have any restrictions on it. 1 Finally, statements can contain embedded 



^Recall that a restriction is a mixed statement restricting the number o£ individuals that stand 
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function terms, for example, 

3x3y3z[ parent- set- of(x) — parents(child-set-of(z))A 
parent-set-of{z) = parents(child-set-of(y))}. 

The class description is preprocessed for operationalization as follows: (i) existential 
quantifiers appearing in the scope of universals are removed, (ii) definitions of con- 
cepts that are not represented are expanded , and (iii) embedded function terms in 
specific statements are removed. 

6.1.1 Removing Existential Quantifiers 

The system removes existential quantifiers by skolemization. When a general state- 
ment contains an existentially quantified variable 2 , the system replaces that variable 
with a new function of the universal quantifiers in the surrounding scope. For exam- 
ple, the existential z is removed from the statement 

\/x\/y[grandchild(x,y) <?=> 3z(child(x,z) A child{z,y))\ 
by introducing a new function of x and y, F(x,y), and substituting it for z in the 
statement, yielding 

VxVy[grandchild(x, y) <=> child(x, F(x,y)) A child(F(x,y),y))}. 
A representation is also introduced for the new function, e.g., assuming that x is 
a member of a sort Si and that y is a member of a sort 5 2 : 

f: function(sl, s2) . 

Problem situations created with statements involving skolem functions refer to un- 
named individuals that are denoted by those functions. For example, in a situation 
created from the statement grandchild(A, B), there will be an unnamed individual 
that is a parent of B and a child of A. 



in some relation. For example, "A and B are the only grandchildren of C." 

2 Since skolemization is being done on statements in implication normal form, the system must 
check the sense of a quantified subexpression in the antecedent to determine whether it is existential 
or universal. For example, in the statement 

Vy((3x P(x)) => Q(y)}, 
the variable x is universally quantified. To see this, note that the statement is equivalent to 

Vyh(3x P(x))vQ(y)}, 
and to 

Vy[{Vx^P(x)) V Q(y)]. 
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6.1.2 Expanding Definitions 

Next defined concepts that do not have representations are identified, all occurrences 
of such concepts are expanded using their definitions, and then their definitions are 
removed from the problem. Since these concepts have no representations, it makes 
no sense to operationalization them in their current form. For example, operational- 
ization would generate a procedure for the definition of grandchild which added 
grandchild(A, B) to a situation in which there was a C such that child(A.C) and 
child(C, B). But such a procedure makes no sense since there is no representation 
for grandchild. The concept grandchild is an example of a concept that is not rep- 
resented in FAMILIES. Its definition (with existential quantifier removed) is used to 
expand the statement zx3y grandchild{x,y) into the statement: 

3x3y[ parent-set-of(x) = parents(child-set-of(F(x,y))) A 
parent-set-of(F(x,y)) — parents{child-set-of(y))}. 

Then the definition of grandchild is removed from the problem. 



6.1.3 Removing Embedded Function Terms 

Function terms denote individuals. A specific statement that contains embedded 
function terms is a special case of a statement that contains individuals in place of 
those function terms. For example, the presence of the above statement in the small 
FAMILIES problem implies that problems in that class can contain statements of the 
form 

3x3y'j>arent- set-o f (x) = parents(y)}. 
Therefore, embedded function terms are removed from specific statements in de- 
veloping the class description. The result of doing this for 

3xdy[ par ent- set-o f{x) = parents(child-set-of(F(x,y))) A 
- parent-set-of(F(x,y)) = par ents{child- set-o f{y))\. 
is 

3xi3yi parent-set-of(xi) — parents(yi)}. 

The preprocessed description of-the small FAMILIES class is shown in Figure 6.2. 
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3x3y couple(x) — couple(y) 
3y3y[children'(x) = {y}} 
3x3y parent-set-of(x) = parents(y) 

Vx\/y\parent-set-of(x) = parent-set-of(y) => couple(x) = couple(y)} 
VxVy\parent-3et-of(x) ^L Acouple(x) = couple{y)} 
Vx[parent-set-of(x) ^_j_=> x E parent-set-of(x)) 
\/x[couple{x) ^L=> x E couple(x)] 
Va; x E child- set- of{x) 

Figure 6.2: The class description for the small FAMILIES problem. 

6.2 The Operationalization Procedure 

Operationalization works from the class description, capturing constraints in three 
steps. In the first step it identifies the operational literals 3 in the class description. 
These are literals that correspond to operations of existing representations. For ex- 
ample, x € child- set-of(y) is an operational literal in the small FAMILIES problem 
because it corresponds to the ADD-ELEMENT operation of set and the class de- 
scription contains the statement 

Vz x 6 child-set-of(x) 

The system knows which literals correspond to operations because associated with 
each operation of the library structures is a schema giving the form of the literal that 
the operation corresponds to. For example, the ADD-ELEMENT operation has the 
schema x G 5 associated with it. 

In the second step of operationalization, the operational literals are used to rewrite 
uncaptured statements until they are sequences of implications involving only op- 
erational literals. A statement is rewritten by assuming an operational literal in it, 
simplifying the statement based on that assumption, and then forming an implication 
of the form 

assumption => simplified-statement. 
For example, given the operational literal x G siblings{y), the statement 

yxVy[x € siblings{y) & y a siblings(x)\, 



A literal is an atomic formula or a negated atomic formula. 
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is rewritten by assuming the literal in the statement, yielding 

true ^ y £ sibling s(x). 
This statement simplifies to y 6 sibling s(x). Then the following implication is 
formed: 

~ixiy{x € sibling s(y) => y £ siblings(x)]. 

More generally, an assumption is made in a statement and then a series of assump- 
tions are made in the final consequent of the statement until what remains is an 
implication whose antecedent literals are operational and whose final consequent is 
an unconditional statement involving operational literals. 4 A statement in this form 
is said to be in operational form. Each such sequence is called an ope rationalization 
sequence. 

Recall operationalization captures a constraint by designing a response to every com- 
bination of operations that .can cause the constraint to be violated. Each statement 
in operational form corresponds to one combination of such operations, with the con- 
sequent corresponding to the appropriate response. If operationalization produces 
a statement in operational form for every sequence begun from a constraint, the 
constraint is captured. 

The third step of operationalization is to translate the operational forms created by 
step two into procedures (or daemons) by treating the logical implication a => 3 as, 

"When a is true (or becomes true) do 3." 
For example, the statement 

'ix\/y[x G siblings(y) => y G siblings(x)} 
is translated into 

WHEN ADD-ELEMENT(siblings(y).x) DO ADD-ELEMENT(siblings(x),y) 
which is read, 

"When x is added to siblings(y), add y to siblings(x)." 
This is done by a process that is similar to determining that a literal is operational: 
each literal is matched against the schemas in the structure library to identify the 



4 An unconditional statement is one that is logically equivalent to a conjunction of literals. 
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operation that the literal corresponds to. 

Note that the procedure generated in the above example does not fully capture the 
constraint 

VxV(/!x t siblings(y) =-?/6 siblings(x)} 
because it does not capture the converse, i.e., 

\?xiy\y 'i siblings(x) => x $ sibling s(y)). 
This is just a special case of the general fact observed above that multiple opera- 
tionalization sequences are required to fully capture a constraint. 

The system handles this case by treating x £ siblings(y) as a different operation 
from x £ siblings(y). If a problem contains the operational literal x siblings{y), a 
separate operationalization sequence is started with the statement 

y^Vyu £ siblings(y) ■==> y E siblings(x)}, 

which yields: 

'ix\/y[x £ siblings(y) => y <£ siblings(x)). 

The next three sections of this chapter detail the three steps of operationalization 
outlined above. 

6.2.1 Identifying Operational Literals 

A literal is operational if, when interpreted procedurally, it is an operation supported 
by the representation in the formula. The system checks each literal appearing in 
the class description to see if it is operational. It does this by examining the repre- 
sentations of the concepts mentioned in the formula. This check differs depending 
on whether or not the relation symbol of the formula is a domain relation. When it 
is, the operations of that relation's representation are checked for a schema matching 
the literal. The problem statement language has two special relations in it, £ and 
= . All other relations must be domain relations. When a literal's relation symbol is 
one of the special relations, the system checks the representations of the terms in the 
argument positions of the formula. For example, in a literal whose relation symbol 
is G, the representation of the second argument is checked. Thus, x e siblings(y) is 
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operational because the range of siblings is represented in term of set. 

Note that a problem's representation may use library types with operations that are 
never used in building problem situations. The method outlined above for identify- 
ing operational literals identifies only operations that are actually used in building 
problem situations. 

Usually the procedural interpretation of an operational literal can either be a tell 
operation or an operation to test whether something is true of a problem situation. 
For example, A £ siblings(B) can be interpreted as, "add A to the siblings of B" or 
as. "test whether A is in the siblings of B.'' However, some literals can be interpreted 
only as tests. For example, A 6 -p. where ^ is a constant set, can be interpreted as a 
test, but not as a tell operation because we can not add an element to a constant set. 
Note that all literals that can be interpreted as tell operations can also be interpreted 
as tests, but some literals that can be interpreted as tests can not be tell operations. 

We will refer to operational literals that can be interpreted only as tests as test 
literals; those with both interpretations will be referred to simply as operational 
literals. Whether a literal is a test literal or an operational literal can affect the way 
a procedure is generated for a statement (the third step of operationalization). 

Not all literals are operational because the library structures do not have operations 
for every literal. For example, the statement 

V 'x\/y\siblings(x) = child- set-of(x) — {x}\ 
is not operational because set does not support a difference operation. As we will 
see, non-operational literals are operationalized just as if they were conditional state- 
ments. 

6.2.2 Operationalization Sequences 

The second step in operationalization is to use the operational literals identified in the 
first step to derive one or more statements in operational form from each uncaptured 
statement in the class description. As noted, each statement in operational form is 
derived by one operationalization sequence which repeatedly makes assumptions in a 
statement until a. statement in operational formis produced. This section details this 
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process for conditional statements and unconditional statements whose literals are not 
operational. Unconditional statements whose literals are operational are considered 
to already be in operational form and nothing is done to them here. 

The procedure for operationalization sequences works as follows: 

1. For each uncaptured statement <p and each oper ,tional literal that mentions a 
representation in the statement, begin an operationalization sequence by uni- 
fying the operational literals with the literals of the statement. Note that 
equalities between terms and constant are unified with statements in a special 
way as discussed below. Also note that equalities between non-constant terms 
correspond to an operation supported by an equality system that is built into all 
specialized representations. This is explained in Chapter 7. Assume that there 
are n operational literals mentioning representations in o and denote these by 
Qi, . . . ,a n . An operationalization sequence is begun by assuming a,- in <f> and 
then deriving an implication for each way that a; unifies with 4>. This is done 
as follows: 

(a) Find all literals in 6 that unify with a t . The result of one successful 
unification is a most general unifier. 

(b) For each unifier, 6, a new statement is constructed whose form is, 

ai[9] => 4>\e](a,[e}/true). 

The notation <p{6] stands for the result of making the substitutions given 
in in <p and <f>(a/,3) is the result of substituting 3 for every occurrence of 
a in 0. Thus the right hand side of the statement constructed is the result 
of making the substitutions given in the most general unifier in both 4> 
and a x and then substituting true for a x \0\ in <t>. 

(c) The resultant statement is simplified according to the rules given in Fig- 
ure 6.3. 

2. If the new statement is in operational form, the sequence is complete. The con- 
sequent of the new statement is either (i) an unconditional statement involving 
only, operational literals, (ii) the constant true, or (iii) the constant false. 
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^true > false -^false > true 

true A P > P false A P > false 

true V P > true false V P > P 

£rue => P > P P => £rue > true 

/a/se -- P > frwe P => /a/se > -P 

P <=> true » P P & false > ^P 

Figure 6.3: Simplification Rules 

In case (ii), the new statement is thrown away because a statement of the form 

O l => . ■■{0 n =- true) 
is logically equivalent to true. Put in terms of operationalization procedures, 
the statement can be thrown away because a procedure that adds true to a 
problem situation is useless. In case (iii), the final consequent of the new state- 
ment is false, i.e., 

<Pi => ■ ■•{4>n -' false). 
Such a statement is replaced by the logically equivalent statement 

3. If the new statement is not in operational form, operationalization tries to 
continue the sequence by making an assumption in the final consequent. 

(a) If no assumptions can be made then operationalization of <t> fails. In this 
case, the system fails to capture the constraint of ® in the designed repre- 
sentation. A representation that partially captures a problem's constraints 
can be used as a specialized reasoner working along side a theorem prover 
to solve problems. The theorem prover need consider only those statements 
that are not captured, leaving the constraints of the other statements to 
be enforced by the specialized representation. 5 

(b) If it finds an assumption in 3 in the statement, 

a x ^> (a 2 => ■■■(a n =>/?)) 
then the sequence is continued with, 



in 



3 I have not attempted to fully consider interfacing a specialized reasoner with a theorem prover 
my research. There- is -a body of research, on just this topic (e.g., jMiller k Schubert 88'). 
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"i => (a 2 => -(<*« => («n+i => flr)))i 
where oc n +\ is the new assumption and 3l is the result of substituting true 
for a n+ i in 5. 

As an example of this process, consider the statement 

\/x\/y{x £ siblings(y) <=> y £ 5i'6/ingf5(a;)] 
and the operational literal x x £ sibling s(y x ). 6 

An operationalization sequence is started by assuming X\ £ siblings(yi) in the above 
statement. This is done by finding unifications of the literal with literals of the 
statement. One such unification results in, 

\/xi\fyi : Xi £ siblings(yi) =- (true <=^ j/i £ -sii/irc^s^!))], 
which is simplified to, 

VcciV2/i[a;i £ sibling s(y x ) => y x £ .si&/m<7s(a;i)]. 
This statement is in operational form, completing the first sequence. 

Another operationalization is begun with the other possible unification of Xi £ 
siblings{y x ) with a literal of the original statement, i.e., y £ siblings(x). This results 
in the same statement as the first sequence. 

The above procedure is modified slightly when an assumption is an equality with a 
constant term, e.g., siblings(x) = ■■p. Instead of unifying the assumption with literals 
of a statement, unifications are found between the non-constant term (sibling s(x)) 
and terms of the literal in the statement being operationalized. Then for each uni- 
fication, the simplified form of the statement's consequent has p substituted for 
siblings(x). Thus, for each unification 6 found, the new formula generated for a 
statement of the form. 

«i =?" («2 => •••(«n - ^)), 
has 3 replaced by, 

F(x) = ip{0}=> 3(F(x)[6}/p). 

We continue the example from above with the operational literal sibling s(x 2 ) = <p, 



6 Variables with subscripts are used in these examples to make it clear when variable substitutions 
have occurred. 
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assuming it in 

'ixiy[x G sibling s{y) <=> y G sibling s(x)]. 
This is done by unifying siblings{x 2 ) with terms of its literals, then substituting 
* for sibling s{x 2 ). One term it unifies with is sibling s{y). The result of substituting 
x 2 for y and siblings{x 2 ) for siblings(y) is 

x G ~p -^ x 2 G siblings(x). 

The new statement created in this step is 

'ixix 2 [siblings{x 2 ) = p => {x G *? <f> x 2 G siblings(x))}. (2) 

This statement is not in operational form so operationalization looks for an assump- 
tion to make in the consequent. Suppose it assumes x G <p, deriving the formula 

\/xix 2 \ sibling s{x 2 ) = p => (x G <p =- {True <^ x 2 G siblings{x)))\, 
which is simplified to 

VxVx 2 [siblings(x 2 ) = p ^ (x E f ^ x 2 E siblings(x))]. 

Next the system assumes x 3 G 3iblings(x 4 ) in (2). This yields the operational form: 

VxiVxslsiblingsixi) = <p => (x 3 G siblings{x i ) => z 4 G y>)]. 
Since siblings(x 2 ) also unifies with the term siblings(x) in the statement 

Vx\/y[x G siblings(y) <^ y G siblings{x)}, 
operationalization finds another way to assume siblings(x 2 ) = ^ in the statement. 
This sequence results in the same operational form as the last sequence. 

When an operationalization sequence produces a statement whose final consequent 
is a non-operational literal, it continues to make assumptions in an effort to simplify 
it. That is, it continues just as though the final consequent of the statement were 
conditional. Unconditional general statements whose literals are not operational are 
also dealt with in this way. For example, the FAMILIES problem class description 
contains the statement: 

~ix[siblings{x) = child-set-of(x) - {a;}]. 
This is operationalized like any conditional general statement. Suppose, for instance, 
that the problem's class description contains the statement 
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nxi siblings(xi) = 0, 
making siblings(xi) = an operational literal. An operationalization sequence is 
started by assuming this literal in the statement, yielding 

siblings(xi) = =• child- set-of{x x ) — {x\} = 0. 
The next section explains that statements generated during operationalization are 
simplified. This statement is simplified by the mechanism described there to 

siblings(x) = =■ child-set-of(x) = {x}, 
which is in operational form. 

Rewriting Intermediate Statements 

As noted, statements containing the constants true and false are generated during 

operationalization and then simplified according to the rules given in Figure 6.3. 

In addition, the intermediate statements are subjected to a body of rewrite rules 

that attempt to further simplify them based on mathematical knowledge, mainly 

about sets. Simplifying rewrites are applied iteratively to statements until no more 

are possible. Here is an example of how the rewrite system is integrated with the 

operationalization process. Suppose an class description contains the statements 

\lxiy\x £ brothers{y) <=» x 6 siblings(y) A sex(x) = male] 
3z siblings{z) = 0. 
The system begins an operationalization sequence by assuming siblings(z) = in 

the general statement. This results in the statement 

"ix\/z[siblings(z) = => (x 6 brothers(z) <^> x 6 A sex(x) = male)}. 

There is a rewrite rule that embodies the fact that empty sets have no elements by 
simplifying the literal x G to false. This rule rewrites the above statement as 

\/xiz[siblings{z) = => (x £ br other s(z) <£> false A sex(x) = male)}, 
which is then simplified to 

V 'xV z[sibling s(z) = => x $. brother s(z)}. 
Another rewrite, which embodies the fact that sets with no elements are equal to 
the empty set, is applied to this statement; this yields 

ixVz siblings(z) = =±- br other s(z) = 01, 
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which is recognized as being in operational form. 

In general, the success of operationalization can depend on the simplification rules 
that the system has. A more complete discussion of the rewrite system, including its 
current collection of rules is given in appendix A. This collection has proven sufficient 
for the twelve problems that the system has been tested on. 

Optimizations of Operationalization 

Operationalization is expensive. To a certain extent this does not matter because the 
cost of operationalization is incurred at representation design time, not at problem 
solution time. This is especially true if the resultant representation will be used 
repeatedly because the cost of operationalization can be amortized. 

One reason operationalization is expensive is that it can generate a large number 
of procedures to operationalize a statement. This does effect the efficiency of the 
representation designed. The system has several strategies for reducing the cost of 
operationalization. For the twelve problems the system has been tried on so far, 
these strategies allow a representation to be designed and used to solve the problem 
in much less time than a theorem prover using brute force search in the original 
representation. 

One strategy that the system uses is to try to minimize the use of operationaliza- 
tion at all, by capturing the constraints of statements during extended classifica- 
tion. A second strategy is to make operationalization more efficient. One technique 
is to reduce the number of sequences generated for a particular statement. Other 
efficiency-enhancing techniques detect when operationalization produces statements 
that are subsumed by existing statements in operational form, and in that case avoids 
generating a procedure for them. 

The technique that reduces the number of operationalization sequences exploits the 
fact that separate statements with the same final consequents are equivalent if their 
sequences of antecedents are permutations of each other. For example, the following 
two statements are logically equivalent: 
P^(Q^ R) 

-Q => (P -• R). 
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It is also the case that the procedures generated for these two statements are equiva- 
lent. This fact is used to avoid operationalizing statements with all combinations of 
literals, reducing the number of operationalization sequences generated for a state- 
ment. 

The system currently has one technique to detect when it has generated a statement 
that is logically subsumed by another statement in operational form. One way to 
increase the efficiency of operationalization and of the representations the system 
designs is to add more techniques of this type. 

The existing technique exploits the fact that if a procedure is included for a statement, 
then including a procedure for a less general statement will not change the behavior 
of a representation. It removes any statement of the form 

4>1 => ■■■{<Pn => (01 A-A»J) 

if operationalization has generated a statement whose consequent is a conjunction 
of literals which is a superset of {0 1; ...,ip m } and whose antecedents are a subset of 
{4>\, ■ ■ ■ , 4> n }- For example, 

P=(Q\R) 
subsumes 

S=-(P^ R). 

To see that any statement of the second form will subsume any statement of the first, 
consider the following two step argument. First, as has been noted, the statement 

4>\ =-•••( On -• fc' ) 

is logically equivalent to 

(0! \ ••• A o n ) =■ y. 
In this second form it is clear that any statement obtained by removing literals 
from the conjunction in the antecedent is more general than this statement because 
it concludes the consequent from fewer conditions. Now consider the following two 
statements: 

<Pi => ■■■{On => (Vi A--- Mi> m )) (3) 

4>i =>---(4> n -i --(7iA---A 7m ,)). (4) 

From the previous step of the argument we know that if 
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{V>l,...,Vm} = {7l»--->7m'}, 

then (4) is more general than (3). Notice that this is still the case if 

{ip 1 , . . . , y m } C {'f 1 ,...,-lm'} 
because all of the conclusions of (3) are also conclusions of (4). 

6.2.3 Generating Procedures from Statements in Opera- 
tional Form 

This section first explains the process of generating procedures for unconditional 
statements, then for conditional statements. As a result of the last step, all statements 
in the class description are in operational form. An unconditional statement made 
up of operational literals is in operational form. An unconditional general statement 
is compiled into a daemon that responds to the creation of new individuals of the 
sort found in the statement. Its response is to execute operations corresponding to 
the literals in the statement. For example, the procedure generated for the statement 

Vx x £ couple(x) 
responds to the creation of a new married couple by adding an element to it. For 
instance, if a situation does not contain a data structure corresponding to couple(B), 
then adding a statement containing that term will cause such a data structure to be 
created. When this happens, the procedure that enforces the constraint of the above 
general statement executes the operation ADD-ELEMENT(couple(B),B). 

Recall that the specific statements of the class description are derived from the spe- 
cific statements of the original problem by replacing the individuals with existentially 
quantified variables. There is no point in translating the specific unconditional state- 
ments of the class description into procedures because we are interested in the actual 
problem situation. Instead we translate the original specific statements. For example, 
the statement couple{Q) = couple(P) is translated into code that creates a couple 
containing P and Q and this code is executed in creating the problem situation. 

An unconditional specific statement is translated into a sequence of operations. For 
example, the statement 

A e siblings(B) A Be sibling s{C) 
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is translated into 

ADD-ELEMENT(siblings(B),A) 
ADD-ELEMENT(siblings(C),B). 

Conditional statements are compiled into daemons that respond when operations 

corresponding to their antecedents are executed. The basic idea is to compile a 

statement of the form, 

<P\ => ■■■{On => (01 A ••• Al/> m )), 

where all the <j,- and %\ are operational literals, into a sequence of daemons. Each 
daemon in the sequence, except the last, waits for the execution of an operation and 
then creates a new daemon. The last daemon, e.g., the daemon corresponding to 
6 n —■ (i\ A • • • A tp m ) above responds to the execution of an operation corresponding 
to its antecedent 6 n by executing an operation corresponding to each literal in the 
consequent. 

Suppose that the operation corresponding to each of the fa above is $; and that the 
operation corresponding to each of the </>; is $;. Then the basic form of the procedure 
generated for the statement above is, 

WHEN $! DO 

WHEN $ 2 DO 

WHEN $ n DO *!,..., * m 

The execution of the statement "WHEN op DO ..." causes a daemon to be created 
that waits until an operation (op) is executed and responds by executing the state- 
ments after the DO. When an operation $x is executed, the above procedure executes 
a statement of the form, 

WHEN $ 2 DO 

WHEN * n DO *i,..., * m 

This creates a new daemon because the bindings for the arguments of $i are substi- 
tuted into $ 2 ? • • • i $n- As a concrete example, consider the statement 
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x € siblings(y) => (sex(x) = male => x G br other s(y)) . 
The procedure generated for this statement is, 

WHEN ADD-ELEMENT(siblings(y),x) DO 

WHEN ASSIGN-CONSTANT(sex(x),male) DO 
ADD-ELEMENT(brothers(y),x) 

When a particular operation is executed that adds an individual to the siblings of 
another individual, say ADD-ELEMENT(siblings(B),A), this daemon responds by 
executing the following statement: 

WHEN ASSIGN-CONSTANT(sex(A),male) DO 
ADD-ELEMENT(brothers(B),A) 

This creates a new daemon that waits for the execution of the operation ASSIGN- 
CONSTANT! sex( A), male). 

Note that WHEN does not create daemons like those in a standard object oriented 
programming language. A standard way to implement a daemon like the one in the 
above example is to produce code that executes after an ADD-ELEMENT message 
is sent of a sibling set object. The problem with this approach is that in specialized 
representations the object denoted by siblings( A) may have other names too, making 
it possible for an element to be added to it without explicitly referring to it as 
sibhngs(A). Therefore, there must be a method of identifying the names of an object 
when an operation is performed on it. The method that specialized representations 
use involves the integration of daemons with an equality system. This is explained 
in Chapter 7. 

To completely capture the constraint of a statement, the procedures generated for 
it must work independently of the order of satisfaction of the antecedent literals. 
One way to accomplish this is to allow operationalization to generate statements for 
all permutations of a statement's antecedents. This approach is too expensive: if a 
statement being operationalized has n antecedents, then n! statements are created to 
operationalize it. 

As noted earlier, the system has a more efficient approach in which the procedures 
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generated are insensitive to the order in which the operations of a statement's an- 
tecedents are executed. For example, the procedure generated for the statement 

x G siblings(y) —■ (sex(x) - male =>i£ brothers(y)) 

works even if the operation ASSIGN-CONSTANT(sex(A),male) is executed before 
ADD-ELEMENT(siblings(B),A). 

This is accomplished by compiling all but the first literal so that the procedure checks 
to see if the antecedent is true before generating a daemon that waits for it to become 
true. This check is generated from the test interpretation of the operational literal. 
For instance, the procedure generated for the example above involving siblings and 
brothers is actually compiled into 

WHEN ADD-ELExMENT(siblings(y),x) DO 

IF sex(x)=male THEN ADD-ELEMENT(brothers(y).x) 
ELSE WHEN ASSIGN-CONSTANT(sex(x),male) DO 

ADD-ELEMENT(brothers(y),x) 

The expression sex(x)=male is a test operation supported by the representation of 



sex. 



One additional complication arises in the compilation process. Sometimes when con- 
stants get substituted for variables in a statement, the result is literals which no longer 
have tell interpretations. For example, recall the operationalization of the statement 

VxVylx e siblings(y) <=> y G siblings(x)}, 
given the literal siblings{y x ) = ^ (where <p is a constant). One statement gener- 



ated 



is 



'ixvylsiblingsiy^ = y> => [y x <= siblings{x) => x G f)}. 
We can test whether an individual is a member of a constant set, but there is no 
operation that adds individuals to constant sets. 

This complication is straightforward to address. As explained above, literals are 
usually compiled into a test followed by a daemon that waits for a corresponding 
operation. Literals that correspond only to tests are compiled into tests with no 
following WHEN statement. For example, the literal x G <? in the above statement 
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is compiled into a test to make sure that the constant bound to x is a member of ip, 
i.e., signal a contradiction if its not. 

Some test literals are compiled into iteration constructs. For example, another state- 
ment generated in operationalizing 

\/-xiy[x G siblings(y) t=> y e siblings(x)}, 



is 



yxVy[siblings(yi) = y? =• U fc ,? -- j/i G sibling s{x))}. 

Note that unlike the example above, the first occurrence of a; is in a test literal. Since 
this does not correspond to an operation and since this is the first occurrence of £, it 
will be unbound when it comes time to perform the test. Because of this the literal 
x G f is compiled into the program fragment 

FOR EACH x e v? DO... 

In the previous example, the test literal is treated differently depending on whether 
its arguments are bound to constants at the time the test is performed. This is a 
special case of the following phenomenon: The ability to generate a test at all for a 
literal depends on the whether its arguments are bound when the test is performed. 
For example, consider the statement 

yxVy\/z[x = spouse(y) —• (y = spouse(z) => x = z)}. 
The procedure generation process for this statement begins with 

WHEN ASSIGN-CONSTANT(spouse(y),x) DO. . . 
In the next step, the system tries to generate a test for the literal y = spouse(z) 
and notes that z is unbound. No test can be generated for this because it requires 
checking the spouses of all individuals. 

The system attempts to deal with this problem by reordering the antecedents of a 
statement and trying to generate a procedure for the new statement. Continuing 
the above example, when the system finds that it can not generate a test for y = 
spouse(z), it changes the above statement to 

yxVyVzl/y = spouse(z) =- (a; = spouse(y) =» x = z)]. 
This time when the second antecedent is reached, y is bound and a test can be 
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generated. ' The procedure produced for this statement is 

WHEN ASSIGN-CONSTANT(spouse(z),y) DO 
IF x=spouse(y) THEN 

IF x^ z THEN contradiction 
ELSE WHEN ASSIGN-CONSTANT(s P ouse(y),x) DO 
IF x^ z THEN contradiction. 



6.3 Soundness of Operationalization 

Problem situations are built by executing the tell operations of a representation to 
add the specific facts of the problem. The data structures built always represent a 
set of specific facts. The information content of a situation is the set of specific facts 
that its data structures represent. 

To show that operationalization is sound, we must show that the procedures produced 
for a statement <p only add facts to problem situations that follow from cb. More 
precisely, let P be a procedure generated in operationalizing <j> and let $ be the 
information content of a problem situation at the time P is executed. Then we must 
show that the facts added to $ by P follow from the set of statements {$,</>}. 

We show this in two steps. First, we show that for any statement ib produced in oper- 
ationalizing <t>, <p r rb (i.e., ip follows from <f>). Then, we argue that, as a consequence, 
the procedure generated from ip must add facts to a situation that follow from <j>. 

Consider the process by which ib is generated from <j>. In the first step, an assumption 
a is made in © and the statement 

a => 4>' 
is produced. The expression 4>' in this statement is the result of substituting all 
occurrences of a in by true and simplifying. It is easy to show that 



'Notice that we could keep a list of all the individuals in the problem and then it would have been 
possible to operationalize the original form of this statement. The result would be a less efficient 
procedure. Thus, in this case, we got lucky. This illustrates the current method of reordering 
antecedents is not particularly robust. See [Smith85] for a more general discussion of antecedent 
reordering. 
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We prove this statement by contradiction. Suppose to the contrary that is true 
but that a =? 0' is false. For this to be so, a must be true and 4>' must be false. 
But because of the way <f>' was derived from d>, if a is true, cj> is the same as o' . This 
contradicts our assumption that 4> and <j>' have different truth values. 

Since V is the result of zero or more applications of the assumption making process, 

I" </>. 

Suppose that ip has the form 

01 =>•••(<?„ =* U'i A ••• A m )). 
The procedure P^, generated from v. will add the facts i{>i, ■ ■ ■ li'm to situations 
containing the facts 

0! A ••• A0„. 

Thus if P^ executes in situation $, it will add facts that follow from {$,0} and, 
since we have already shown that <b h 0, it follows that P^, will only add facts that 
follow from {$,0}. 

We would also like operationalization to be complete, i.e., we would like the pro- 
cedures produced for a statement <p add all specific facts to problem situations that 
follow from (f>. More precisely, let $ be the information content of a problem situation 
and let $ be any subset of $. We would like it to be the case that if has been 
successfully operationalized and a specific fact <x follows from the set of statements 
{ty,4>}, then a appears in $. 

However, operationalization is not complete. One source of the incompleteness is 
that the representations designed by the system can build only very limited kinds 
of disjunctive situations. In general, disjunction in a problem situation can not be 
represented. The difficulty is best illustrated by an example. Consider the following 
propositional problem. 
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P 

p ~. (w V T) 
P ^(RV S) 
(WAR) =-Q 
(WAS)=t-Q 
(TAR) =>Q 
(T AS) => Q 
Query: OQ. 

Q follows in this problem. However, consider a representation capturing the con- 
ditional statements of this example by operationalization. If we use it to create a 
situation representing P, this situation will not contain Q. The difficulty is that 
there is no way to create a problem situation representing W V T without committing 
to one of them actually being true. More precisely, to capture 

P^(W\IT), (5) 

operationalization includes procedures for the operational forms 

P ^ {^W => T) and 

P=>(^T => W) 
(among others). These are the only procedures included for (5) computing conse- 
quences of P. Therefore, when P is added to a problem situation in this representa- 
tion, no additional facts will be added to the situation. 



6.4 Summary 

This chapter has explained the process of operationalization, proven it to be sound, 
and demonstrated that it is incomplete when arbitrary disjunctive situations must 
be represented to solve a problem. Operationalization is the representation design 
system's way to capture the constraints that extended classification fails to capture. 
It begins with the statements left uncaptured by extended classification. Figure 6.2 
shows the statements left uncaptured by extended classification of the small FAMI- 
LIES problem. 

Operationalization of Small FAMILIES begins with the following operational literals: 

couple(x) = couple(y) 
children'(x) = {y} 
parent-set-of(x) = parent s(y) 
x € child- set-of(y). 
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Operationalization proceeds by assuming these literals in the appropriate general 
statements, trying to derive statements in operational form. 

Literals of the form F(x) ^J_ are treated specially in this process. Recall that this 
states that x has an image under the partial function F. During operationalization, 
if the class description contains any operational literals of the form F(x) = y (for 
any arbitrary term y) then there can be some individuals that have images under F. 
Therefore, if the literal F{x) ^1 appears anywhere in the class description, F(x) -£L 
is made an operational literal of the problem class. 

For example, the statements 

\/x[couple{x) ^-l=> x £ couple(x)} 
3x3y couple{x) = couple(y) 
in the small FAMILIES problem cause the statement 

3x couple(x) t^x 
to be added. As with any other operational literal, this causes an operationaliza- 
tion sequence to be started by assuming couple(x) ^J_ in 

\/x[couple(x) ^-L=* x G couple(x)} 
which results in this same statement. 

Literals of the form F(x) ^± are compiled into daemons that respond to the creation 
of range elements of the function F. For example, the procedure generated for the 
statement above responds to the creation of any coxiple(x) by adding x to it. 

The procedure generated for the statement 

yxiy\jparent-set-of(x) = parent- set- of(y) => couple(x) = couple(y)} 
responds to an operation that makes two individual's parent sets the same by making 
their couples the same. 8 

The procedure generated for the statement 

\/x\/y\parent-set-of{x) ^1 Acouple{x) = couple(y) 

=> parent-set-of(x) = parent-set-of(y)} 

does the following. When a parent set is created for an individual x who is an element 

of a couple, the parent set of the other individual in that couple is equated with the 



Chapter 7 explains what it means to make to structures like parent sets and couples the 
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sew ppsent set. 

Code i» also generated for tike specific statement 

ecutkm -of this code c***e» tlw situation 

to be crested. T hi* process is described in Cluster 7. 
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Chapter 7 
Representation Machinery 



This chapter details how the library types are implemented and how representations 
designed in terms of them are used to solve problems. It explains three underly- 
ing mechanisms that the representations rely on in building problem situations. It 
explains how the library types relation, function, set, and individual are imple- 
mented. Finally, it shows how the representation designed for the small FAMILIES 
problem is used to build one problem situation. 

Recall that library structures are implemented as parameterized abstract data types 
(ADTs). Each library ADT is a prototype for creating representations. Concept 
representations are created from ADTs in the concept hierarchy by instantiation. For 
example, an instance of type relation is used to represent married. This structure 
is created by the declaration 

married: relation (family-member, family-member). 

Representations of sorts are created from ADTs in the sort hierarchy by defining 
subtypes. For example, a representation of the sort family-member is created by 
the definition 

(deftype family-member specializes individual disjoint). 
Instances of the ADT family-member are used to represent family members like 

N. 



174 



7.1. UNDERLYING MECHANISMS 175 

7.1 Underlying Mechanisms 

Three facilities support specialized representations: an equality system, a mechanism 
that creates anonymous individuals, and a daemon invocation mechanism. These are 
closely integrated in the process of creating situations, so they are described together. 

7.1.1 The Equality System and Anonymous Individuals 

The equality system is similar to RUPjMcAUester82j. It maintains equivalence classes 
of terms known to be equal and supports an operation that merges equivalence classes 
as new terms are found to be or asserted to be equal. 

The implementation distinguishes terms, which are linguistic items, from objects, 
which represent semantic items, i.e, things in the problem domain. In this chapter, 
we will indicate that an item is a term using the italic font (e.g., .4) and that it is an 
object using typewriter font (e.g., A). 

Unlike RUP, our equality system requires that each equivalence class have exactly 
one object associated with it that is the object denoted by the terms of this class. 
For example, in the situation representing 

mother-of(A) = B, mother -of(C) = mother-of(A), 
the terms mother-of(A) and mother-of(C) are placed in the same equivalence class. 
The object associated with this class is B. 

The object denoted by an equivalence class may be unknown at the time the class is 
created. For example, a situation representing only the statement 

mother-of{C) = mother-of(A), 
has an equivalence class containing the terms mother-of(C) and mother-of(A), but 
the object denoted by this class is unknown. In cases like this one, the system creates 
an anonymous object and associates it with the class. The type of the anonymous 
object (i.e., the sort of the individual it represents) is determined from the range of 
mother-of to be family-member. 

The equality system also -allows two equivalence classes to be marked to indicate that 
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they are known to be disjoint (i.e., the individuals that the two classes denote are 
known to be unequal). Any attempt to equate individuals in classes so marked results 
in the equality system signalling a contradiction. 

When the equality system merges two equivalence classes, it creates a new class which 
is the union of the terms in the two original classes. Since each class denotes exactly 
one object, merging two classes requires equating the objects associated with them. 
To determine what object the new class denotes, the system sends a signal to the 
object associated with one of the original classes notifying it that it is being equated 
with the other object. This is called an EQUATE notification. When two objects are 
sent an EQUATE notification, they respond with a single object which the equality 
system associates with the new class. 

Accordingly, the library ADTs used to create objects are required to respond to the 
EQUATE notification. In the current library, the ADTs that do this are individual, 
set, and their specializations. 

To illustrate the merging process, consider the situation created to represent 

mother-of(A) = B, mother-of(C) = D. 
It will contain two equivalence classes: one containing the term mother-of{A) with 
the associated object B and the other containing the term mother-of(C) with the 
associated object D. Now suppose the statement motker-of(A) = mother-of(C) is 
added. The two classes are merged and the objects B and D are sent an EQUATE 
signal. They respond with a single object that can be referred as B or D (i.e., B = D). 

7.1.2 The Daemon Invocation Mechanism 

Section 6.2.3 explained how operationalization generates daemons for conditional 
statements that are not captured by classification. These can not be implemented in 
the way daemons are standardly implemented in object oriented programming lan- 
guages. Consider an example of the difficulty. One of the daemons operationalization 
generates for the statement 

Mxiylx t child-set-of{y) \ x ^ y => x E sibling s(y)} 1 



-Recall ttat the child:set-'o{ an individual is a set of individuals with the same parents as the 
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is 



WHEN ADD-ELEMENT(child-set-of(y),x) DO 

IF x ^ y THEN ADD-ELEMENT(siblings(y),x). 

A standard way to implement a daemon like this one is to produce code that exe- 
cutes after an ADD-ELEMENT message is sent to a child-set object, conditionally 
adding an element to a sibling-set object. The problem with this approach is that, 
in specialized representations, the object denoted by child-set-of(A) may have other 
denotations (i.e., other terms denoting it), making it possible for an element to be 
added to it without explicitly referring to it as child-set-of(A). For example, suppose 

child-set-of(A) = child-set-of(B). 
If we add an element to child-set-of(B), the standard implementation of the above 
daemon will only add that element to sibling s(B). 

In light of this difficulty, our daemon invocation mechanism makes use of the equiv- 
alence classes maintained by the equality system. Daemons are attached to terms 
rather than to objects. When an operation is performed on an object, the system 
searches through the terms denoting that object, invoking daemons that respond to 
the operation performed. For example, when an element is added to the object de- 
noted by child-set-of(B), the system searches the equivalence class of terms denoting 
that object. It finds the terms child-set-of(A) and child-set-of(B) and invokes the 
above daemon on both. 

Note that accessing terms in this way is also how the system determines the refer- 
ences for variables in daemons. Thus, in the above example it is from the terms 
child-set-of(A) and child-set-of(B) that bindings of A and B are established for y in 
the daemon. 

Invocation becomes more complicated when daemons have nested patterns in their 
WHEN conditions. Consider, for example, a daemon like 

WHEN op(F(G(x))) DO ... 
The following difficulty arises: When the operation op is executed, its argument 
will be an object, say A. In response, the system looks for terms denoting A of the 
given individual. 
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form F(y). The above daemon is associated with terms of this form. However, the 
daemon should not be invoked on all terms of the form F(y), only on those in which 
y is an object denoted by terms of the form G(x). As the nesting level in a dae- 
mon increases, the process of filtering out those terms to which a daemon should 
be applied involves more levels of "indirection." This can get quite expensive. For- 
tunately, nested general statements are rare in analytical reasoning problems and, 
consequently, so are nested daemons. 



7.2 Implementation of Library ADTs 

This section describes how the library ADTs individual, relation, function, and 
set are implemented. Except for 1-1 function, the rest of the library types are 
implemented as specializations of these. The implementation of 1-1 function is 
also described below. 

The ADT individual is used to represent domain individuals. It is used as a proto- 
type to create representations of sorts by defining subtypes such as 

(deftype family-member specializes individual disjoint). 
Individual is able to represent the fact that an individual can have more than 
one name. It does this with a single data field, called name, which is used to store a 
list of constants that name the individual represented by an instance. For example, 
the instance representing the individual B has the name field (B). Individual has 
several procedures associated with it (answers several messages). When an instance 
of individual receives an EQUATE message of the form 

EQUATE(other-object) 
it responds by returning a single instance whose name field is the union of its name 
field and other-object's name field. For instance, in the situation representing 

mother-of(A) = B 

mother-of(C) = D 
there are objects representing each of the individuals B and D. When the the state- 
ment mother-of(A) = mother-of(C) is added to this situation, one of these objects, 
say B, is sent the message 
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EQUATE(D). 2 The response to this message is to combine the objects representing 
B and D and return the single object whose name field contains the list (B D). 

Individual also supports an EQUAL? message of the form 

EQUAL?(other-object). 
When an instance receives such a message, it responds with true if its name field 
shares any elements with other-object's name field. If the instance and other-object 
are associated with equivalence classes that are marked disjoint, the instance responds 
with false. Otherwise it responds with unknown. 

In the FAMILIES problem, the different names given for individuals of sort 
family -member stand for different individuals. Therefore, the collection 

family-member is specialized to unique-individual. This ADT also has a name 
field, however, it can only contain a single name. An instance of unique-individual 
responds to an EQUAL? message with true if the names are the same or they are 
anonymous and associated with the same equivalence class; false if they have differ- 
ent names or are associated with equivalence classes that are marked disjoint; and 
unknown otherwise. 

An instance of unique-individual responds to an EQUATE message as follows. If 
both instances are anonymous or they have the same name, it does not matter which 
one is returned so the receiving instance returns itself. If one of the individuals is 
anonymous and the other is named, then the named individual is returned. If both 
individuals are named and the names are different, a contradiction is signalled. 

Note that the effect of equating an anonymous individual with a named individual 
is to name the anonymous individual. In a typical scenario, as situations get created 
anonymous individuals get named. 

The ADT relation is implemented as two lists of ordered n-tuples. Instances of 
relation are created to represent particular relations. An instance R created for a 
relation R contains two lists used to store n-tuples of individuals. One list in R is 
used to store the n-tuples of individuals known to stand in the relation R. The other 
list is used to store n-tuples of individuals known not to stand in the relation R. As 



2 Recall that D is the object representing D. 



180 CHAPTER 1. REPRESENTATION MACHINERY 

a problem situation is created n-tuples get added to these lists. 

Instances of the ADT function are used to represent functions. Unlike relations, 
instances of function do not store all pairs in a central list, i.e., the is no list associ- 
ated with a function, say mother-of , storing pairs of the form <x, mother-of (x)>. 
Instead each individual that has an image under a function has a pointer to its image 
element. The pointer is labeled with the function name. Thus, A = mother-of(B) 
is represented by the object representing B having a pointer to the object repre- 
senting A labeled mother-of. The single valued property of functions is enforced as 
follows. Each individual may have several function pointers (may have images under 
several functions), but may have only one pointer labeled with each function's name. 
If there is an attempt to make more than one object be the image of some object 
under a function, the system attempts to equate the two objects. This may cause a 
contradiction. 

Function terms can appear in problem statements that do not give names for the 
individuals they denote. Creating situations for these statements requires creating 
anonymous (unnamed) individuals. For example, the statement 

married(mother-of(A), father-of(B)), 
is represented by creating an anonymous family member, say x, to stand for 
the mother-of (A) and another anonymous individual, say y, to stand for the 
father-of(B). Then the system adds the pair <x,y> to married. 3 

Technically the ADT 1-1 function is not implemented as a specialization of 
function. Instead, the system creates an inverse for each 1-1 function. This 
can be named by a problem statement; if not, the system generates a name. Then 
whenever a functional relationship is added to a problem situation, if the function has 
an inverse, the inverse relationship is also added. This allows the existing mechanism 
enforcing the single valued property to also enforce the one-to-one property. 

The ADT set is instantiated to represent individual sets and subtypes are defined 
from it to represent sorts. As an example of its use to represent a sort, consider 
brother-set, the sort of sets of brothers of the same person. It is represented with 
a subtype whose instances are sets of brothers. Set has a three data fields: one 



This assumes, «f course, that married is represented as a relation. 
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for storing a list of the individuals known to be members of a particular set, one for 
storing individuals that are known not to be members of a particular set, and one to 
indicate whether a set is closed (i.e., all the members of that set are known). It has 
the following procedures associated with it: 

1. ADD-ELEMENT adds a new individual to a set unless the set is closed. In this 
case a contradiction is signalled unless the set already contains that individual, 
A contradiction is also signalled if the new individual is an element of the known 
non-members of the instance. 

2. ADD-NON-ELEMENT adds a new individual to the list of known non-members 
unless the individual is known to be a member. 

3. ASSIGN-TO-CONSTANT makes the list inside an instance equal to a given 
list and marks the instance closed. If the instance already contains elements, 
they must all be members of the constant set or a contradiction is signalled. 
Also, no elements of the constant set can be known non-members. 

4. EQUAL? takes another set instance as an argument and returns true if both 
instances are closed sets and have the same members; returns false if they 
are closed and do not contain the same members or if one set has known non- 
members that are members of the other; otherwise it returns unknown. 

Set also answers the EQUATE message. When an instance of set receives a message 
of the form 

EQUATE(other-object), 
it responds with a single instance that contains the union of the original instance's 
members and the members of other-object. The new instance's list of non-members 
is the union of the non-members of the originals. An EQUATE will signal a contra- 
diction unless one of the following is true: 

1. both original instances are closed and their lists of elements are equal 

2. exactly one of the original instances is closed, the elements of the open instance 
are also members of the closed instance, and none of the known non-elements 

.of the open instance are members of the closed instance 
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3. both instances are open, the non-members of one are disjoint from the members 
of the other, and vice versa. 

7.3 Solving the Example Problem 

This section explains how the small FAMILIES problem is solved with the represen- 
tation designed for it. The initial problem statement is shown in figure 7.1. The final 
problem statement is shown in figure 7.2. The boxes in the figure enclose statements 
captured by the specialized representation and the statements marked by (*) are 
captured by operationalization. The statement marked by (**) is captured in the in 
type hierarchy of representations. The final representation is shown in figure 7.3. 

P : family-member, Q : family -member, 

R : f amily -member , S : family-member 

grandchild(Q, S) 

\/x child-of(P,x) <=> x = R 

married(Q , P) 

Query: find-all x 1 parent(S,x) 

Figure 7.1: The small FAMILIES problem. 

The system creates a problem situation from the three specific statements found in 
the final formulation of the problem. Let us illustrate how it does this beginning with 
the statement 

children'(parent-set-of(P)) = {R}. 

To create a situation representing this statement, an instance P of family -member 
is created; an instance parent -set-1 of parent-set is created; and P is added to 
it, then the parent-set-of (P) is made parent-set-1. The system has, so far, 
created a situation representing parent-set-of (P). Next, a child-set is created, call it 
child-set-1, assigned the constant value {R}, and this object is made the children 
of parent-set-1. A diagrammatic version of the structure built for this statement 
is shown in figure 7.4. 

Next the system creates a representation of the statement 
couple(Q) — couple(P) A Q ^ P 
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children' '(par ent-set-of(P)) = {R} 

couple(Q) = couple(P) AQ^P 

3z[children'(parent-set-of(Q)) = child-set(z) A children'(parent-set-of(z)) = 

c/w/d-.se^.S')] 

\/xiy{child-of(x,y) ■& parent(y,x)\ 

(**)VxVy\parent-set-of(x) = parent-set-of(y) => couple(x) = couple(y)\ 

, + , \/x\/y\parent-set-of(x) ^J_ Acouple(x) = couple(y) =>• 

parent- set-of(x) = parent-set-of(y)} 
(*)Vx a; € parent-set-of(x) 
(*)Vaj a; G couple(x) 



Vx-i[couple(x) = couple(x) Aj^i] 

\/x\/y[couple(x) = couple(y) Ai^t/» couple(y) — couple{x) A x ^ y\ 
\/x\/yVz[couple(x) = couple(y) Ax ^ y A couple(y) = couple(z) Ay ^ r =>• 

couple(x) ^ couple(z) V i = :] 
\/xiyiz[c(mple(x) = couple(y) A x ^ y A couple(x) = couple(z) A x ^ z =?• y 
Vz child-set(x) ^ children' '(par ent-set-of(x)) 
Mxiy{ child-set(y) = children' '(parent- set-of(x)) => 
child-set(x) ^ chil dr en' (parent- set- of(y))) 

\/xiy\/z[(child-set(y) = children'(parent-set-of(x))A 
child-set(z) = children' (parent- set- of{y))) =>• 

child-set(z) ^ children' '(par ent-set-of(y))} 

VWyVzVW child-set(x) — children'(parent-set-of(y))A 

child-set(x) = children '(par ent-set-of(z)) A y ^ zA 
child-set(x)= children' (par ent-set-of(w)) =>• 

to = j/ V w = z] 



Query: find-the parents' (child- set-of(S)) 

Figure 7.2: Final formulation of the small FAMILIES problem, 
(deftype family -member specializes unique-individual disjoint) 
couple: function(family-member, married-couple) 

(deftype married-couple specializes fixed-size-disjoint-set(2, family)) 
(deftype child-set specializes disjoint-set (family-member) ) 
child-set-of : function(family-member, child-set) 
(deftype parent -set specializes 
fixed-size-dis joint-set (2, family-member)) 
parent -set -of: partial-function (family-member, parent-set) 
children' : l-l-partial-function(parent-set , child-set) 
parents' : l-l-function(child-set, parent-set) 

Figure 7.3: Final representation of the FAMILIES problem. 
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I 

Q 



Figure 7.4: Diagram of representation for children' '(parent- set-of(P)) = {R}. 

and adds it to the problem situation. To accomplish this, it creates an object Q 
and a couple containing Q. It creates another couple and adds the object P to it. It 
EQUATES the two couples just created producing a single couple containing P and 
Q. Note that since Q and P are instances of unique-individual, the second conjunct 
is already represented. 

Because couple(Q) - coup/e(P), the procedure that operationalization gener- 
ates to capture the constraint between couples and parent sets now makes 
parent-set-of(P) = parent-set-of(Q). This causes a single parent-set to be created 
containing P and Q. This situation is diagrammed in figure 7.5. Note that in the 
diagrams, the box with two slots is overloaded: it is used to represent both couples 
and parent sets. 



Q 


P 



Figure 7.5: Diagram of representation for couple(Q) — couple(P). 

The final step in creating the problem situation is to add the representation of the 
statement 

3z[ children(parent-set-of(Q)) = child-set(z)A 
children{parent-set-of(z)) = child-set(S)]. 

The representation of this statement alone is shown in figure 7.6. The existential 
variable z is represented in the actual structure with an anonymous instance of 
family-member, call that instance z. Since children' is one-to-one, as the system cre- 
ates the structure representing this statement, it enforces the one-to-one constraint 
by equating the objects denoted by child-set-of(z) with the child- set- of{R). Since 
an individual is always a member of his own child set and since child-set(R) = {#}, 
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the anonymous individual z is equated with R. This results in the structure shown in 
Figure 7.7. 



Q 



/ 



k 



v 



sy 



Figure 7.6: Diagram of representation for 3z[children(parent-set-of(Q)) 
child-set(z) A children(parent-set-of(z)) = child-set(S)}. 



Q 




Figure 7.7: Diagram of representation for all three statements. 



The system now answers the problem query by inspecting the parents of S which is 
a parent-set. The system knows that this set has two individuals but that it only 
knows one of them, R. So it answers the question with R and an indication that 
there are other members that it does not know. 
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Related Work 



This section highlights the differences between my research and previous work in the 
following areas: 

• Solving word problems 

• Automatic programming 

• Research on good representations 

• Problem reformulation 

• Mental models 

8.1 Solving Word Problems 

Since my representation design system solves word problems, one might expect there 
to be an interesting relationship between it and previous systems that do this. This 
section discusses the relationship between my research and two previous efforts to 
solve word problems and concludes that the relationship is only superficial: All of 
the systems solve word problems, but the underlying research objectives are very 
different. 

In [Bobrow68], the author reports on a program called STUDENT that solves high 
school algebra word problems. Bobrow was interested in issues of translating problems 
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stated in natural language to simple algebraic equations. He was also interested in 
a psychological model of high school student's performance in this activity and used 
his model to make predictions about human performance. 

The most important difference between STUDENT and my work is that the ob- 
jective of STUDENT was to translate a problem, stated in English, into a single 
pre-established target representation. My system is concerned with designing good 
target representations. For example, in most cases, confronted with a problem like 
those that STUDENT solved, I would expect my system to select algebraic equations 
as the target representation. 1 For high school algebra word problems, there would 
not be much representation design to do. However, given a different kind of problem, 
my system should (and does) design different kinds of representations. 

Another important difference between Bobrow's and my work is that he was interested 
in the psychological implications of his model. I have not concentrated on this in my 
work. Like many other efforts in AI the methods my system employs are inspired by 
human performance, but my emphasis has been on designing good representations 
regardless of whether or not I have captured the design process that people employ. 

The other word problem system I compare my system to is the work reported in 
[Novak76]. This system solved physics word problems in the area of rigid body statics. 
Like Bobrow's, this work was concerned with translating a problem, stated in English, 
into a single target representation and then solving it. However, Novak points out 
that physics problems of this type are not deductive. The hard part of solving them is 
figuring out what assumptions to make so that the problem decomposes into idealized 
pieces to which physical principles can be applied directly. By contrast, the problems 
that my system works on are deductive and do not require making sophisticated 
assumptions to simplify the situation presented in the problem. 

Another important difference between both of the research efforts discussed here and 
mine is that they considered the natural language translation problem an important 
part of what their systems did. I have not considered natural language translation 



l I have not tried my system on problems like those that STUDENT solves. However, my system 
does design representations in terms of equations. For example, part of the representation that is 
designed for the FAMILIES problem is a collection of equations relating sets of children, siblings, 
brothers, and sisters. 
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The distinction I have drawn between specifying programs and problems seems clear 
enough in the example just given. However, it becomes less clear for systems such as 
[Cohen86] whose input is a predicate calculus specification of a set to generate or a 
condition to test. Figure 8.2 gives an example of the input to Cohen's system which 
is called AP5. This specification describes more of what a programmer wants the 
machine to do than how it is to do it. 

(DeclareRel Sex 2) ; a binary relation 
; e.g., (Sex Sam Male) 

(DeclareRel Parent 2) ; (Parent child parent) 
(DefineRel Sibling (x y) 
(and (not (eq x y) ) 

(Exists (parent) 

(and (Parent x parent) 

(Parent y parent))))) 
(Defun list -nephews (person) 
(loop for nephew s.t. 
(and (Sex nephew 'male) 
(Exists (sibling) 

(and (Sibling person sibling) 
(Parent nephew sibling)))) 
collect nephew)) 

Figure 8.2: A specification input to AP5. 

Note that we could easily define a similar analytical reasoning problem that provides 
a definition of the sibling and nephew relation and then asks the question 

find-all x | nephew(P.x). 
However, AP5 and my system address very different problems. AP5 is given a spec- 
ification and a collection of annotations that select representations for the primitive 
relations in the specification. It compiles the specification given information it has 
about the costs of different ways of testing relations and for generating n-tuples of 
individuals standing in a relation. Once the person observes the behavior of the pro- 
gram that AP5 generates, he/she can choose better representations for the primitive 
relations and use AP5 to recompile the specification. The important point to notice 
is that., the person selects the representations and AP5 produces the best code it can 
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based on those selection. My system chooses the representations. 

I once conceived of my system producing a specification instead of a program. I had 
in mind that this specification would then be handed to an automatic programming 
system such as [Barstow79] (or possibly APS) which would then make data structure 
selection and algorithm optimization decisions. I was subsequently persuaded to 
produce a program instead. Still, the code generation part of my system does not 
try to perform data structure selection or algorithm optimization. For example, my 
system is concerned with deciding that some problem concept is best represented 
as a set. When a program is generated, the representation design system uses a 
default implementation for sets instead of trying to choose from among alternative 
implementations (e.g., a list or a bit vector) as, for example, [Rovner76] does. 

More conventional automatic programming systems such as QA3 (described in 
!Green68j) take the approach that automatic programming is a theorem proving 
activity in which the system tries to prove the existence of some entity that we want 
a program to produce. A human provides the system with a theory of the operations 
available for writing programs and when the system uses these to prove a theorem, 
it produces the desired program as a by-product. For example, to generate a pro- 
gram that sorts a list, QA3 tries to prove a theorem stating that for all (finite) lists 
there exists a sorted version. Since it has been provided with axioms that describe 
list operations and operations for comparing numbers, a by-product of the proof is a 
program that sorts lists. 

As Green and others have pointed out, the formulation of the set of axioms that 
describe both the operations available for programming and the desired program to 
be written can have a dramatic effect on whether the theorem proving approach 
succeeds. My work has concentrated on how better formulations of a set of axioms 
can be found automatically. 

Operationalization is the activity that my system engages in that is most similar to 
conventional automatic programming in that it transforms predicate calculus state- 
ments into code fragments. It is a restricted form of automatic programming that is 
used as a last resort in capturing problem constraints. It can not produce recursive 
programs and can only produce simple forms of iteration. 



192 CHAPTER 8. RELATED WORK 

8.3 Research on Good Representation 

As I have already stated, the principle difference between my research and previous 
efforts to understand what makes a good representation is that those efforts were con- 
cerned with recognizing the properties of good representations, while my research is 
about generating such representations prospectively. Still much of this work provided 
me with a good starting point and also helped me to sort out what the important is- 
sues about representation are. Several works give good explanations of what it means 
for a representation to be direct (or analogical) and what the consequences of having 
such a representation are (see [Sloman71, Sloman85, Hayes74, Lenat & Brown 84]). 

Pylyshyn's work, reported in ,Pylyshyn75], helped me to understand a phenomenon 
that I observed in the representations that people design to solve analytical reason- 
ing problems. Specifically, it does not make sense to talk about the directness of the 
structure of a representation devoid of the interpretation functions that give a seman- 
tics to that structure. For example, even in an obvious case like using a structure 
with two slots to represent married couples, it is really the interpretation provided by 
the procedures that manipulate couples that preserve the relationship between the 
representation and couples in the "real" world. 

8.4 Problem Reformulation 

Some early work on problem reformulation can be found in [McCarthy64, Newell65], 
and [Newell66|. Instead of trying to exhaustively compare my work with these, section 
concentrates on three works in the area. One work reported in [Amarel68] is included 
because of its influence in the area. The two other works [Korf80, Subramanian87] 
are included because they illustrate recent work in the area. Special attention is paid 
to [Korf80] because the work reported there is the most direct ancestor of mine. 

Amarel's work is a paper and pencil study leading us through a successive refinement 
of representations for the familiar Missionaries and Cannibals (M&C) problem. The 
work differs from mine in two ways. First, it considers a substantially different do- 
main: reasoning about the effects of actions. Second, it does not seriously consider 
issues in automating the search for a refinement sequence. The resulting characteriza- 
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tion of the transformations used and space of possible representations being searched 
is loose. In contrast, these issues have been at the forefront of my concerns. 

Amarel leads us through a sequence of refinements and, at each step, identifies some 
interesting properties of the M&C problem. Then he discusses a refined represen- 
tation that capitalizes on those properties. Many of the transformation discussed 
are informally motivated by arguments about the search space generated in finding 
solutions. For example, in the first version of the problem, the operators encode the 
possible moves and the non-cannibalization conditions are expressed as general con- 
straints on the problem. In the next version, the general constraints are "compiled" 
into the operators so that the new operators are applicable only when they pro- 
duce a non-cannibalized (legal) state. The resultant representation (with constraints 
compiled into the operators) is better because the search space is smaller. 

In a number of places in his argument, Amarel fails to define what is involved in the 
transformations he offers. For example, he begins with representations in terms of 
production systems in which rules construct new states from old ones. At one point 
he switches to a reduction system. Reduction systems begin with a problem stated 
as 

initial state =>■ final state. 
This is interpreted as "the final state is attainable from the initial state." The process 
of solving the problem involves "reducing" this statement to a sequence of states at- 
tainable by application of primitive operators. The problem with this reformulation 
(from production systems to reduction systems) is that we are not told how such a 
switch is made or when it is advantageous to do so. In fact, it turns out that this 
transformation is not advantageous for M&C. 

In another part of the paper, problem representations are described in terms of state 
space graphs. At one point, Amarel structures these graphs by viewing them as 
juxtaposed two dimensional grids. Unfortunately, the reader is left with little feeling 
as to what the precise nature of this transformation is and the conditions under which 
such a transformation is useful. In this presentation he exploits our visual abilities 
to notice certain properties. But what are these properties? And how did Amarel 
identify them in the problem? 
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Korf's work, reported in [Korf80i, has a similar mind set to that of Amarel's. How- 
ever, Korf went much further to develop a formalism for describing representations 
and transformations between different representations. He also provided a characteri- 
zation of two dimensions along which different transformations effect representations: 
viewing them as homomorphic and isomorphic transformations in a space of possible 
representations. Homomorphic transformations preserve structure and reduce infor- 
mation content. Isomorphic transformations preserve information content but change 
a representation's structure. 

An important contribution of his work is that Korf demonstrated how representation 
changes that had been previously viewed as "leaps of insight 1 ' could, in fact, be 
modeled as gradual refinement involving transformations of the type he identifies. 

One of his examples is the mutilated checker board. He describes transformations 
along the way to a representation consisting of two integers, one representing the 
number of uncovered black squares and one representing the number of uncovered 
red squares. One mapping involves assuming that the squares are indistinguishable. 
Then any situation in which the same number of squares are uncovered can be thought 
of as the same. This can be modeled as a homomorphic transformation that maps 
board situations into sets of indistinguished uncovered squares. Since all that is 
important about the sets of squares is their cardinality, the set representation can 
be transformed into one in which the sets are replaced by integers representing their 
cardinality. This transformation can be modeled as an isomorphic mapping between 
sets and integers representing their cardinality. 

In some ways my work can be seen as an extension of Korf's. He was concerned 
with characterizing a space of possible representations and types of transformations 
on them. My work is concerned with how to choose the right transformations to 
do to arrive at a good problem representation. I have identified some of the essential 
properties of representations and given a method to design representations with those 
properties. 

Korf (and Amarel) viewed problem solving as state space search and observed that 
changes in representation (i.e., the description of a problem state) affect the size of 
the space. The focus of my work has been explaining how representations do this 
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and how to design representations that yield smaller search spaces. The claim is that 
when a representation captures more constraints in its structure and behavior the 
problem solving space is reduced. 

One point that Korf does not make clear is that he actually describes two different 
kinds of homomorphic transformation: transformations on search spaces and trans- 
formations on state descriptions. An example of a transformation on a search space is 
introducing uninterruptible operator sequences (macro-operators, lemmas, etc.) and 
then removing the original operators. This has the effect of reducing a search space by 
''skipping over" intermediate states derived by the original operators. An example of 
a transformation on a state description is collapsing a checkerboard into two integers 
representing the cardinality of the set of red squares and black squares respectively. 
This has the effect of reducing a search space by throwing away distinctions in a state 
description and thereby grouping states into equivalence classes. These two different 
kinds of homomorphisms have the same effect on a search space: the overall size of 
the space is reduced. However, the reasoning involved in performing them is very 
different. 

My research has only considered one method of collapsing state descriptions in its 
use of the irrelevance filter (described in Chapter 3) which can be viewed as removing 
information from a state description that is irrelevant to solving a problem. One can 
imagine other homomorphic transformations (such as those suggested in Korf's fur- 
ther work section) whose effect is to remove information without changing a problem's 
solution. 

My research does consider homomorphic transformations on search spaces because 
it identifies specialized structures for representing. These structure have specialized 
procedures associated with them. Such procedures enforce consequences directly in 
a representation, skipping over intermediate steps necessary for a theorem prover to 
deduce those consequences. 

;Subramanian87i reports on a study in which logic is used as a tool to investigate 
properties of irrelevance. The paper discusses the application of their theory to 
three example problems including proving that a particular reformulation is justified 
because it removes only information that is irrelevant to solving a problem. This 
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example applies the theory to justify the removal of intermediate links in an ancestral 
tree for a problem that asks whether two individuals are in the same family. The 
intermediate ancestral links are irrelevant because all that matters to answering the 
question is the relationship between a person and the root of his/her family tree. 

In some ways this work is similar to my work on the irrelevance filter. There are 
two differences. First, my procedure for identifying and removing irrelevance is com- 
putationally tractable. Second, the modifications made to a problem in Subrama- 
nian's work are more subtle, e.g., viewing the removal of irrelevance as reformulation. 
The problem given in the paper is initially stated in terms of father, ancestor, and 
same family and is reformulated in terms of founding father and samefamily. This 
example provides one instance of a kind of reformulation different from the kind I 
have studied. My reformulations are guaranteed not to change the semantics of a 
problem. In contrast, the example given in this paper is a reformulation that changes 
the semantics of the problem while preserving the correctness of the solution. It 
would be interesting to pursue this type of reformulation in automatic representation 
design. 

8.5 Psychology and Mental Models 

There is a body of work in psychology on human mental models. The idea in this 
theory is that people solve problems by building and examining concrete integrated 
structures that are based on perceiving or imagining the events (situations) described 
in text[Johnson-Laird82j. The way these structures appear to be built and examined 
has some intriguing similarities to our specialized representations. 



Chapter 9 

Summary and Future Work 



9.1 Summary 

The contribution of this thesis is a technique that, given a problem, designs a repre- 
sentation specialized to a small class containing that problem. The representations 
designed are specialized because they capture the constraints of a problem class in 
their structure and behavior. When a problem solver uses a specialized representation 
to solve a problem, it is restricted so that it can express only situations in which the 
constraints of the class are satisfied. As a result, the space that the problem solver 
considers is significantly reduced, increasing problem solving efficiency. 

An abstract characterization of the design technique is as follows. The input to 
the process is a problem statement, the representation mapping for the concepts in 
that statement, and a collection of available structures with axioms describing the 
kinds of constraints that they can capture. The constraint on a concept is captured 
when it follows from the axioms of the structure representing it. The system tries 
to modify the representation mapping (i.e., to select different structures to represent 
the problem concepts) so that the constraints on those concepts are captured. 

The technique is implemented by three processes called classification, concept in- 
troduction, and operationalization. Classification and concept introduction run as 
coroutines to capture as many of the constraints of a problem as possible. Classifica- 
tion specializes the representation of individual concepts (relations, functions, etc.), 
while concept introduction introduces related concepts, attempting to find better for- 
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mulations in which classification captures more of the problem's constraints. Usually 
these processes are unable to capture all of a problem's constraints, so representation 
design includes operationalization, a "mopping up" process. It captures the con- 
straints left uncaptured by the previous processes by writing new procedures. Such 
procedures enforce constraints by responding when a constraint is violated, adding 
new information to a problem situation to reestablish the constraint. 

Classification has a library of structures which are implemented as parameterized 
abstract data types. These are used as prototypes for creating representations that 
enforce constraints through a combination of syntactic structure and behavior (pro- 
cedures). Representations created from more specialized structures capture more 
constraints. For example, two of the structures in the hierarchy are function and 
1-1 function. A representation created from 1-1 function is more specialized than 
one created from function because it captures the additional constraint that every 
range element is the image of at most one domain element. 

Classification "pushes" concepts down into a hierarchy of concept classes. Some of 
these classes have library structures associated with them. The library structure 
associated with a class is used to create representations for concepts in that class. 
Such representations capture all of the constraints of the class. For example, the 
hierarchy contains the class of all symmetric binary relations. This class has the 
structure sym-rel associated with it. Representations created from sym-rel capture 
the symmetry constraint. 

Successful classification identifies the most specialized structure (or collection of struc- 
ture) with which to represent the concept. When a representation is specialized, fewer 
situations can be expressed in it. For example, when "parents" is represented as a 
function we can express situations in which an individual has more than one pair 
of parents; when it is represented as a 1-1 function such situations can not be 
expressed. Thus specialized representations reduce the space that a problem solver 
must consider. 

Given a problem, classification comes up with a collection of maximally specialized 
representations for its concepts. However, classification by itself has a serious lim- 
itation: Its success depends on the particular vocabulary used to state a problem. 
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The FAMILIES problem, for example, is stated in terms of married, which is clas- 
sified beginning with relation. None of the specializations of a relation capture 
the fact that married couples are all of size two. However, if the problem had been 
stated in terms of couples, classification would have been more successful because a 
specialization of set takes advantage of size constraints. 

Concept introduction overcomes the limitation of classification by introducing related 
concepts when classification does not capture all the constraints of a concept. Con- 
cept introduction is implemented by rules that are attached to nodes in the taxonomy 
of library structures. When a node is reached that has an associated introduction 
rule, the rule is applied, introducing one or more new concepts. The effect of in- 
troducing a new concept is to represent it differently, giving classification access to 
different parts of the taxonomy: different library structures capture different con- 
straints and have different specializations. Representing a concept differently often 
allows classification to find a better fit between the constraints on that concept and 
the constraints captured by library structures. 

Introducing new concepts also reformulates a problem. This is accomplished by 
treating the logical definition of a new concept as a rewrite to perform on problem 
statements. Reformulation is useful for two reasons. First, it often allows the system 
to recognize new properties of a concept, facilitating further classification. Second, 
operationalization is often able to capture reformulated statements with more efficient 
procedures. 

As new concepts are introduced, the representation design system explores a space of 
alternative problem formulations. For example, introducing couple for married cre- 
ates two alternative formulations: one in terms of married and one in terms of couple. 
Alternative problem formulations are maintained because the representation design 
system can not tell whether an introduced concept will capture more constraints than 
an existing one until the new concept is fully classified. 

Classification extended by concept introduction is called extended classification. 
What is interesting about extended classification is that the two processes that are 
involved in it are fairly simple, however, the behavior of the combination of classifica- 
tion and introduction can result in sequences of reformulations that change a problem 
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significantly. For example, consider the introductions that result from extended clas- 
sification of married: 

1. The concept spouses is introduced. This is a function from individuals to the 
sets of individuals to whom they are married. 

2. The concept non-empty-spouses is introduced. This is a partial function from 
individuals to the non-empty sets of individuals to whom they are married. 

3. The concept spouse is introduced. This is a partial function that captures the 
fact that individuals have at most one spouse. 

4. The concept couple is introduced. This is a partial function from individuals 
to the married couple that they are members of. Couple captures the following 
facts: not all individuals are married, each married couple is disjoint from all 
other married couples, married couples contain exactly two members. 

Operationalization tries to capture the constraints of any statements remaining af- 
ter extended classification by writing new procedures and using these to specialize 
the representations created by classification and concept introduction. For example, 
suppose the statement 

\/xiy[x G siblings(y) <=> y G sibling s(x)] 
is left uncaptured and that siblings is represented as a function from individuals 
to their set of siblings (i.e., the range elements of siblings are represented as sets). 
Operationalization captures the constraint of this statement by writing procedures 
that watch for the addition of facts violating the statement's constraint. One such 
fact has the form x x G sibling s(yi). Accordingly, one procedure that operationaliza- 
tion writes watches for the addition of facts of this form and responds by adding a 
fact of the form y x G siblings(xi), reestablishing the constraint. 

When operationalization succeeds in writing a procedure like the one above for every 
fact whose addition to a problem situation can violate a statement's constraint, the 
statement is captured. 

When all of the constraints of a problem class are captured in a representation, it is 
used to solve the problem as follows. First, the specific statements of the problem 
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are expressed in the specialized representation. As this is done, a data structure is 
created that represents the situation described in the specific problem. For example, 
Q is married to P in the small FAMILIES problem. In the specialized representation 
designed for this problem, this is expressed as, "Q and P are in the same couple." 
When the relationship between Q and P is expressed in this way, an instance of 
couple, containing Q and P, is added to the problem situation. 

The data structure created when the problem specifics are expressed in a specialized 
representation is then inspected for the problem solution. For example, the question 
in the small FAMILIES problem is, "Who are the parents of S?" This question is 
answered by accessing the parents of S in the data structure created for the problem. 

The representation design system has been tested on eight analytical reasoning prob- 
lems. It successfully captures all the constraints of these problems. However, in 
general, representation design can fail to capture some constraints of a problem. In 
this case, the result of representation design is a specialized representation and a 
smaller collection of statements (the uncaptured ones) that the problem solver must 
reason about explicitly in that representation. In effect, the problem solver uses the 
specialized representation to accelerate the problem solving process in the same way 
that specialized reasoners have been used to accelerate theorem proving. 

9.1.1 Summary of An Example of Representation Design 

This section summarizes the design of the specialized representation for the small 
FAMILIES problem used as an example throughout this thesis. The problem is 
shown again in Figure 9.1. 

P : family-member, Q : family-member, 

R : f amily-member , S : family-member 

grandchild(Q , S) 

Vx child(P,x) <?4> x — R 

married(Q , P) 

Query: find-all x | parent(S,x) 

Figure 9.1: The small FAMILIES problem. 
Before representation design begins, the system tries to acquire definitions missing 
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from the problem statement, attempts to eliminate irrelevant information, and de- 
velops a description of the initial representation. These activities are detailed in 
chapter 3. 

For this problem, the representation design system begins by prompting the user for 

definitions of the concepts mentioned. The user supplies definitions for grandchild 

and parent: 

\/x\/y[grandchild(x,y) o 3:(child(x, z) A child(z,y))} 
V 'xv 'y\par ent{y , x) v=> child(x,y)\. 

Then, the system runs the irrelevance filter and finds that married is irrelevant. 
Before eliminating it, the system asks whether there are necessary or sufficient con- 
ditions for married in terms of other concepts mentioned. The user responds with 
both: 

\/x\/y^Jc{child{x. c) A child(y, c) A x ^ y => married(x,y)} 
\/xVyVc[married(x,y) A child(x,c) => child{y,c)\. 

With these statements added, married becomes relevant. 

Next, the system describes the problem's initial representation. In general, such 

descriptions include definitions for three types of concepts: the primitive concepts, 

concepts that have relevant mixed constraints on them, and concepts that appear in 

find-all queries. The description derived for the example problem is: 

(deftype family-member specializes unique- individual disjoint) 
married: relation (family-member, family-member) 
child: relation(family-member, family-member) 
parent : relation(f amily-member, family-member) 

Parent is included because it appears in a find-all query. Including it allows the sys- 
tem to consider the cost of answering the find-all query by considering the alternative 
formulations generated during extended classification of parent. 

Next, the system performs extended classification on the concepts whose representa- 
tions were defined above. We begin (arbitrarily) with married. Extended classifica- 
tion of married results in a sequence of introductions yielding the concept couple; 
details of this were given in section 5.5. 

As a result, a representation is defined for couple as 

couple: function(f amily-member, married-couple) 
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and the sort married-couple is defined as 

(deftype married-couple specializes 

f ixed-size-disjoint-set (2 , family-member) ) . 

The problem statement is also reformulated and a number of statements are added by 

knowledge acquisition activities. The new problem statement is shown in Figure 9.2. 

Note that the statements enclosed in the box in that figure are captured by couple. 

P : f amily-member , Q : f amily -member , 

R : f amily-member , S : f amily- member 

grandchild(Q, S) 

Vx child(P, x) <==■ x = R 

couple(Q) = couple(P) A Q ^ P 

"ixiy{grandchild{x.y) <=> 3z{child(x, z) A child(z,y))} 

ixiy\j)arent(y,x) <=> child(x, y)\ 

VxVyVcjc/M'W(x, c) A child(y 1 c) A x ^ y => couple(x) = couple(y) A x ^ y] 

\/x\/yyc[couple(x) = couple(y) A x ^ y A child(x,c) => child(y,c)} 



^[couple(x) = couple(x) A x ^ xj 

\/x\/y[couple(x) = couple(y) Ai/t/# couple{y) = couple(x) A x ^ y] 
VxVyV z[couple(x) = couple(y) A x ^ y A couple(y) = couple(z) A y f^ z => 

couple(x) -£ couple(z) V x — z\ 
\f xiyi z[couple(x) = couple(y) A x ^ y A couple(x) = couple(z) A s; / : =■ ]/ = :j 



Query: find-all x \ parent(S,x) 

Figure 9.2: Example problem reformulated in terms of couple 

The system now performs extended classification on child which results in the fol- 
lowing representations: 

(deftype child-set specializes dis joint-set (family-member)) 

child-set-of : function(f amily-member, child-set) 

(deftype parent-set specializes 

f ixed-size-disjoint-set (2, family-member) ) 

parent -set -of : part ial-f unction (family-member .parent- set) 

children' : l-l-partial-function(parent-set , child-set) 

parents' : l-l-function(child-set , parent-set) 

Again, statements get added to the problem and it gets reformulated. The result of 

this is the problem statement in Figure 9.3. As in the previous figure, the statements 

enclosed in the box have their constraints captured by the specialized representations. 

Also note that the two statements marked by (*) were derived by the rewrite system 

from the two statements 
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VxVyVc\child{x,c) A child(y, c) A x ^ y => couple(x) = couple(y) A x ^ y\ 
\/x\/y\/c[couple(x) = couple(y) Ai^yA child(x, c) ■=> child(y,c)} 

The first of these is recognized as a subsumption constraint between parent-set and 

couple and captured in the type hierarchy of representations. 

children'(parent-set-of(P)) = {/?} 

couple(Q) = couple(P) A Q 4 P 

3z[childr en '(parent- set-of(Q)) = child-set(z) A childr en' (parent- set-o f(z)) = 

child-set(S)] 

*ixiy\parent{x,y) ^ child-set-of(x) = children' (par ent- set(y))} 
(*)VxVy\parent-set-of(x) = parent-set-of(y) => couple(x) - couple(y)) 
, + . \/xVy\parent-set-of(x) ^L Acouple(x) = couple(y) => 

parent-set-of(x) = par ent- set-o f(y)\ 
ix par ent- set- of (x) 7^_L=> £ G parent-set-of(x)] 
ix[couple(x) ^L=s x 6 coup/e(a:)] 
Va; a; £ child-set-of(x) 



\/x\/y[couple(x) = couple(y) A x ^ y <£> couple(y) = couple(x) A x ^ y] 
\/xVy\/z[couple(x) = couple(y) Ax ^ y A couple(y) - couple(z) A y ^ z => 

couple(x) t^ couple(z) Vi = z] 
\/xVyVz[couple(x) = couple(y) A x ^ y A couple(x) - couple(z) A x ^ z => y - z] 
\/x child-set(x) -£ children' (parent- set-of(x)) 
VxVy[ child-set(y) = children' (par ent- set- of(x)) => 
child- set(x) ^ childr en' (parent- set-o f(y))] 

Vxiyiz[{child-set(y) = children'(parent-set-of(x))A 
child-set(z) = childr en' (par ent- set-of(y))) => 

child-set(z) ^ children'(parent-set-of(y))} 

Vx\/yVzVw[ child-set(x) = childr en' (par ent- set-o f(y)) A 

child-set(x) = childr en' (par ent- set-o f(z)) A y ^ zA 
child-set(x)— childr en' (par ent- set-o f(w)) =• 



Query: find-all x ^ parent(S,x) 

Figure 9.3: Formulation of example problem after child has been classified. 

Before proceeding with the extended classification of parent, the system attempts to 
create a fmd-the query from it that accesses an existing set, i.e., it transforms the 
find- all query into 

find-the {x \ parent(S,x)}. 
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This is expanded using the definition of parent into 

find-the {x j child- set- of(S) = children' '(par ent- set-o f{x))} . 
The rewriting system transforms this into 

find-the par ents' (child- set-o f (S)) . 
Since it has succeeded in deriving a find-the query that accesses an existing rep- 
resentation (a parent-set), the system expands any other occurrences of parent using 
its definition, removes its definition, and does not perform extended classification on 
it. 

If parent-set had not be created during the extended classification of child, the 
system would not have been able to simplify the original find-all query as shown. In 
this case, the system would proceed by performing extended classification of parent, 
introducing a concept for sets of parents and comparing the new formulation of the 
problem with the formulation in terms of parent. 

Extended classification now terminates and operationalization tries to capture any 
constraints left over. The uncaptured statements are: 

Vx x £ child- set- of(x) (1) 

\/x[couple(x) ^J_=> x e couple(x)} (2) 

V x[par ent- set-o f(x) ^±=> x G parent-set-of(x)] (3) 

^xVy\jpar ent- set-o f(x) ^_L Acouple(x) = couple(y) =? 

parent- set-o f(x) = par ent- set-o f(y)\. 

(1) is compiled into a daemon that responds to the creation of a range element, 
child-set-of(x), by adding x to that child set; (2) and (3) are compiled into similar 
procedures. The procedure generated for (4) does the following. When a parent set 
is created for an individual x his/her couple is made the same as the couple of the 
other individual in the parent set. 



9.2 Future Work 
9.2.1 Experimental Work 

The current system knowledge bases are the result of an analysis of twenty represen- 
tative analytical- reasoning problems (for a discussion of where these came from, see 
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Chapter 1). To date, the system has only been tried on three of these (and several 
variations of each, twelve problems in total). The most immediate continuation of 
the experimental work would be to try the other seventeen problems (and variations 
of them). Chapter 1 argues that this effort should not require any modification to 
the representation design algorithm and only small additions to the knowledge bases. 

Another avenue of experimentation is to study deductive problems other than an- 
alytical reasoning problems found on GREs. For example, I have begun studying 
Schubert's Steamroller. This problem is interesting because it is combinatorially dif- 
ficult enough that when it was first proposed (1978) no automated theorem provers 
were able to solve it. The problem has now been solved by a number of theorem 
provers and there have been discussions in the automated reasoning literature about 
different theorem proving strategies and formulations of the problem (see [Stickel86]). 

The systems that solve Schubert's Steamroller most efficiently are those based on 
sorted logics. Reformulation is required to get from the initial problem statement to 
an appropriate sorted formulation. It appears that my system performs many of the 
steps that people carry out manually to construct a sorted formulation of a problem. 
Of particular interest is the fact that when I hand simulated the representation design 
algorithm on Schubert's Steamroller, it found that information required to solve the 
problem was missing. These facts appear to have been assumed by the researchers 
who developed the sorted formulations manually. 

Another avenue for experimentation is to try the representation design system on 
some of the problems appearing over the years in the literature on problem reformu- 
lation, e.g., the Missionaries k Cannibals (M& C) problem or the mutilated checker- 
board problem. The system requires extension to work on problems that are solved 
by state space search: In its current form, it can only design representations for the 
individual states because it designs representations for problems that have a single 
state of affairs, whose deductive consequences are difficult to compute and are re- 
quired in answering the questions posed. The design of a good representation of 
the individual states in the M&: C problem turns out to be only a small part of the 
design effort. The most leverage is gained in this problem by exploiting properties 
of its search space. To apply my approach here requires my system be given (or to 
automatically generate) a representation of the initial search space, i.e., the space 
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induced by the initial representation of state and of the operators. 

9.2.2 Extending the Current System's Functionality 

There are a number of ways to extend the functionality of the current implementation. 
One way is to extend the query procedures. The current procedure for determining 
if a fact is necessarily true in a problem situation simply looks for that fact in the 
situation. Since the representations that the system designs are incomplete, a fact can 
follow from a situation but not be found by the above technique. Furthermore, unlike 
in a complete deductive system, such a fact can sometimes be proved by establishing 
that its negation is impossible. 1 

We can extend the system's current query procedures so that they are "more com- 
plete. '' The extended procedure for necessity does the following. It looks to see if the 
fact is represented in a situation. If so, it answers that the fact follows. Otherwise, 
it adds the negation of the fact to the situation. If a contradiction results, it also 
answers that the fact follows. If both of these methods fail, it reports that the truth 
value of the fact is unknown. If desired, it can be further extended to answer that a 
fact does not follow if its negation is present or adding the fact causes a contradiction. 

The other query procedures can be extended similarly. Note, however, that while the 
extended query procedures will be able to answer some questions that the originals 
could not. they will not alter the fact that a representation designed by the system 
may be incomplete and, in this case, the query procedures will not be complete. 

Another straightforward extension is to design a better evaluation function for es- 
timating the costs of problem formulations. As noted in Chapter 5. the technique 
for estimating the cost of a statement determines the number of loops in the pro- 
cedure generated by operationalization to capture the constraint of the statement. 
It produces an estimate of 1 if the procedure has no loops, n if is has one loop, n 2 
if it has a loop nested inside another, and so on. This is a fairly gross estimate of 
complexity which could be refined in a number of ways. For example, the procedure 
could produce a more general polynomial that had more than just a most significant 
term and also had estimated coefficients for the terms. Consider, for instance, the 



l In a complete deductive system, -O~>0 is equivalent to Dip. 
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statement 

VxVyVz[R(x,y) A R(y,z) => R(x,z)\ 
for which the current estimate will be n because one loop is required to find bindings 
for z each time a new pair < x,y > is related by R. The current procedure will give 
the same estimate for a similar statement that has a conjunction of two atomic for- 
mulas in the consequent. However, a more accurate estimate for the latter statement 
would be 'In because two facts are added to a problem situation for each binding of 

These and other factors could be taken into account by the estimator producing more 
accurate estimates and this might allow the system to make more refined decisions 
in comparing alternative formulations. 

Another extension is to add more sophisticated techniques for establishing the prop- 
erties of problem concepts during classification. This is especially important if one 
tries to extend the system into other domains. For example, section 9.2.1 pointed 
out that the current version of the system can not design representations that exploit 
properties of a state space. One reason for this is that more sophisticated techniques 
are required to establish properties of such a space. For instance, one very useful 
technique is to begin exploring a state space, look for suggested properties in a por- 
tion of the space, prove by induction over operator sequences that the property holds 
for the entire space, and then use that property to reformulate the problem before 
continuing to solve it. 

This last example suggests another important area of future work: integrating repre- 
sentation design more with the solution process. It appears that people refine their 
representations as they solve problems. Therefore, it seems likely that the system's 
performance can approximate a human's only if it can do the same. 



9.2.3 Representing Disjunction 

In general, the system can not design representations for disjunctive problem sit- 
uations. The difficulty is best illustrated by an example. Consider the following 
prepositional problem: 
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WvT 
Rv S 

(W/\R)=>Q 
(W A S) - Q 
(T AR)^-Q 
(TAS)=-Q 
Query: CQ. 

Q follows in this problem. However, the representation that the system designs will 

not determine this alone. The difficulty is that there is no way to create a problem 

situation representing W / T without committing to one of them actually being true. 

In this problem, the truth value of Q depends only on W V T and i? V S being true. 

One way to handle disjunction like this is to use the representation that the system 
designs to reason by cases. For example, a problem solver could use the above rep- 
resentation by assuming W A R and noting that Q follows; then assuming W A S 
and noting the same; and so on. By trying all possibilities, it could establish that Q 
follows. One reason that the system does not design representations that reason by 
cases to enforce constraint is that doing so is exponential. 

Even though the system does not design representations for arbitrary disjunctive 
situations, it can design representations of certain restricted forms of disjunction. 
For example, one form of disjunction that it can represent has to do with problems 
assigning a fixed set of objects to a set of slots in which each object can be in no 
more than a fixed number of slots. In this case, the system designs a representation 
for the set of possible objects assigned to each slot. In an initial problem situation, 
the possibility set for each slot contains all the possible values it can take on. The 
problem solving process is one of using the problem constraints to remove objects 
from these possibility sets. 

Figure 1.5 shows the PROFESSORS problem, a problem of this type. The represen- 
tation that the system designs for this problem is a sequence of ten offices. For each 
office the system represents the set of possible professors in that office. Initially, the 
possibility set for each office contains all the professors. Problem constraints are used 
to remove professors from these sets. For example, given the statement "Weis is in 
office 2," the representation removes all other professors from the possibility set for 
office 2 and removes Weis from all other possibility sets. 
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Appendix A 

The Rewriting System 



This appendix describes the statement rewriter used by all of the subprocesses of 
representation design. Representation design manipulates statements in a problem 
in a number of ways. The rewriter plays two important roles in these manipulations: 
as a statement simplifier and as a goal directed derivation mechanism. As a state- 
ment simplifier, it replaces terms in a statement with simpler equivalent terms. For 
example, any statement containing the term x e is rewritten, replacing that term 
by false. 

As a derivation mechanism, it supports classification by looking for cues in problem 
statements and attempting to derive a statement in a form expected by one of the 
subprocesses of representation design. One place this is used is in support of classifi- 
cation. For example, classification identifies a size constraint on a set by looking for 
a statement that bounds the set from above. The rewriter supports classification by 
trying to derive a statement bounding a set from above from statements containing 
certain equalities involving universally quantified variables. For instance, the system 
rewrites the statement 



WxVy\/z[y 6 spouses(x) Azt spouses(x) => y 



as 



\/x\/yVz[y 6 spouses(x) => spouses(x) C {y}}. 
Classification recognizes this statement as bounding sets of spouses from above. 

To ensure that the representation design process remains sound, all the rewrite rules 
have been checked for soundness, i.e.. I have shown for each rule that if it rewrites a 
statement <j> as tJ>, then <fi •=> 0. 

I have not attempted to show termination for the existing rule set. However, in 
practice the rewriter has always terminated and I believe it is possible to demonstrate 
termination formally. 
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A.l Statement Simplification 



As a statement simplifier, the rewriter consists of a collection of rules that look for 
patterns in statements as they are added to a problem description. When a rule's 
pattern is found in a statement, the rule rewrites it. As a simple example, one rule 
looks for a statement containing the literal x 6 0. Whenever a statement with this 
literal is added, the rule rewrites it, replacing that literal by false. 

To understand the effect that simplification can have on a problem, note that state- 
ments are added to problem descriptions at three points during representation design: 
(i) when the initial problem description is given to the system, (ii) when problem 
statements are reformulated, and (iii) during each intermediate step in operational- 
ization sequences. 

The rewriter can simplify statements given initially, but generally, there is not much 
simplification (of the sort performed by the rewriter) to do at first. It is unlikely, for 
example, that an initial statement would contain the term x G 0. The simplifier has 
most of its impact on statements generated by reformulation and operationalization 
because these processes make uniform changes in problem statements without regard 
for their meaning. For example, suppose a problem initially contains the statement 

Vx ->6r other (M,x) 
and that during design, the system introduces the concept brothers, a function map- 
ping an individual to his/her set of brothers. This causes every statement in the 
problem that mentions brother to be reformulated. Hence, the above statement gets 
reformulated as 

Vx x $ brother s(M). 
A rewrite rule transforms this statement into brother s{M) = 0. This rule has simpli- 
fied the statement because a statement with fewer variables in it is considered to be 
simpler than a logically equivalent statement containing more variables. There are a 
number of other rewrite rules that, like the example above, simplify statements by 
removing variables. 

Rules can also use the properties of the representations in a statement in simplifying. 
For example, one rule rewrites a subset relationship to an equality when it can deter- 
mine that the two sets involved have the same cardinality. For instance, the subset 
term in the statement 

VxVyVc[{x,j/} C parents(c) =• married(x,y)\ 
is rewritten to an equality because the range of parents is parent-set which is a 
collection of sets of size two. 
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A. 1.1 The Current Collection of Simplifying Rewrite Rules 

This section summarizes the current collection of simplifying rewrite rules. These 
are presented at the knowledge level, avoiding the details of the rewrite mechanism. 
Note that the expressions in the rules below are patterns to match against state- 
ments in the logic, not statements in the logic themselves. The existing rules divide 
roughly into three categories: (i) simplification of literals containing constant sets, 
(ii) simplification of literals based on more general knowledge about sets, and (iii) 
simplification to remove variables. 

Constant Set Simplification 

This is a representative rather than an exhaustive list of rules in this category. 

1. Replace x £ by false. 

2. When c and if are constants, replace c £ <p by true if c is an element of <p, and 
by false otherwise. 

3. When P and Q are constants, replace P U Q by the union of P and Q. 

4. Replace y>\ — S — <p 2 by S = <pi U (p 2 when fi and <p 2 are constant sets and it 
can be shown that <p 2 C S. 1 

General Set Simplification 

These are all the rules currently in this category. 

1. Replace {x \ x £ 5} by S. 

2. Replace Si C S 2 by S\ = S 2 , when card(Si) = card(S 2 ). The rule can de- 
termine the cardinality of a constant set by inspection of the statement, it 
determines cardinality of other set expressions by accessing the definitions of 
the representations in the expression. 

3. Replace {x j P A Q} by {x \ P} D {x | Q}. 

4. Replace {x ! P V Q} by {x \ P} U {x \ Q}. 

5. Replace {x j P A ^Q} by {x j P} - {x \ Q}. 



I Note that the system's matcher handles commutative operators and connectives and symmetric 
relations. Hence, this rule will also match a statement with a term of the form S — <p 2 = v?i because 
= is symmetric. 



214 APPENDIX A. THE REWRITING SYSTEM 

6. Replace {x | x = y} by {?/}. 

7. Replace S-y? = 0bySC v? . 

Variable Removal 

These are all the rules currently in this catergory. 

1. Replace x — x by true 

2. Replace x ^ x by false. 

3. Replace £ <£ 5 by 5 = when x is universally quantified. 

4. Replace x = F(y) A x = G(y) by F(y) = G{y) in a statement S if x does not 
appear elsewhere in S. 

5. When a statement contains the literal F(x) = j/, where y is a variable, replace it 
by true and substitute F(x) for all occurrences of y in the rest of the statement 2 . 

6. Remove the term t from the consequent of a statement of the form 

(t A an A • • • A a n ) => (t A 3 X A • • • A /3 m ). 

A. 2 Goal Directed Derivation 

The rewriter performs goal directed derivations to attempt to derive statements of a 
form that the subprocesses of representation design are expecting. For example, as 
explained in Chapter 4, classification looks for the properties of a problem's concepts 
by looking for statements of a predetermined form. The rewriter supports classifica- 
tion by attempting to derive statements in a form that classification will recognize 
as stating properties that it is looking for. In this mode, the rewriter is activated on 
particular statements, knows what form of statement it is trying to derive, and either 
succeeds in doing so or returns the statement to its original form. 

A rule that begins an attempted derivation executes when the pattern it is looking for 
is found in a problem statement. In response, it records that a derivation has begun, 
marks the statement involved, and records the form of statement it is trying to derive. 
The derivation process proceeds by repeatedly rewriting the marked statement until 
either the goal form is derived or no more rewrite rules can be applied. Also, rewrite 
rules can be restricted so that they are applied only to derivations with certain goals. 



2 This rule is generally known as patamodulation. 
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Here is an example derivation involving the statement 

Vx\/y\/z[y £ spouses(x) A z £ spoiises(x) =» y — z\. (1) 

One rule looks for a statement containing an equality in its consequent, where at 
least one side of the equality is a universally quantified variable. When the rule finds 
such a statement, it begins a derivation whose goal is a statement, concluding that a 
set is bounded from above. The goal in the example is a statement whose consequent 
is of the form spouses(x) C <p, where <p is any constant set. 

The general strategy that the system uses in this derivation is to attempt to rewrite 
the statement so that the consequent is an implication that can be transformed into 
the desired relationship, i.e, a statement of the form x £ S =*- x £ <p (where <p is a 
constant set) which is subsequently transformed into S C <p. 

The system implements this strategy using the following two rules (which are only 
applied to statements in derivations of this type): 

1. When the statement is of the form 

(x £ S A c*i A • • • A a n ) => x = y 
and a; is a universally quantified variable, 3 rewrite the statement as 
(a a A • • • A atn) => {x | x £ S} C {x \ x = y}. 

2. When the statement is of the form 

(* £ S A a x A • • ■ A a,,) =» (x = y x V • • • V x = y m ) 
and x is a universally quantified variable, rewrite the statement as 
(<*! A • • • A o^,) => {x | x £ S} C {x I x = y t V • • . V x = y m }. 

Continuing the example, the first rule rewrites (1) as 

VxVz[z £ spouses(x) =► {y | y £ sp<mses(x)} C {y \ y = z}]. 

This statement is further rewritten by a simplifying rewrite rule described in the last 
section. First as 

\fxiz[z £ spouses(x) =>• spouses(x) C {y \ y = z }\ 
and then as 

VxVz[z € spouses(x) =^> spouses(x) C {z}]. 



3 This pattern will match any statement whose antecedent is a conjunction containing a term 
matching *€ 5 because A is commutative. 
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